Introduction


What is a grid?

Grid technologies represent a significant step forward in the effective use of network-connected resources, providing a framework for sharing distributed resources while respecting the distinct administrative priorities and autonomy of the resource owners. A grid can also help people discover and enable new ways of working together — providing a means for resource owners to trade unused cycles for access to significantly more compute power when needed for short periods, for example, or establishing a new organizational or cultural paradigm of focused investments in common infrastructure that is made available for broad benefit and impact.

Arriving at a common definition of "a grid" today can be very difficult. Perhaps the most generally useful definition is that a grid consists of shared heterogeneous computing and data resources networked across administrative boundaries. Given such a definition, a grid can be thought of as both an access method and a platform, with grid middleware being the critical software that enables grid operation and ease-of-use. For a grid to function effectively, it is assumed that

  • hardware and software exist on each resource to support participation in a grid, and
  • agreements and policies exist among grid participants to support and define resource sharing.

Standards to define common grid services and functionality are still under development. The promise of transparent and ubiquitous resource sharing has excited and inspired a variety of views of a grid, often with considerable hype, from within multiple sectors (academe, industry, government) and flavored by numerous perspectives.

Many products are available for implementing "a grid", or grid-like capabilities. In some cases, the focus is on providing high performance capability, either through eased or increased access to existing high performance computing (HPC) resources, or a new level of performance realized through the orchestration of existing resources. In other cases, the focus is on using the network coupled with grid middleware to provide users or applications with seamless access to distributed resources of varying types, often in the service of solving a single problem or inquiry. With both standards and products under rapid development, product selection inevitably affects the definition of the resulting grid; that is, any given grid is at least partially defined by the functionality, focus and features of the product(s) used to implement it. Throughout this Cookbook, high-level concepts and general examples consider a variety of "grid types", but specific examples and case studies necessarily reflect particular products and approaches, with emphasis on those most commonly implemented today.

When grid technology is viewed as evolving into a generalized and globally shared infrastructure (a "grid of grids", composed of campus grids, project grids, regional grids, institutional or organizational grids, etc.), the vision is often referred to as "the Grid". It is still only a concept, but it is similar in many ways to today's Internet, which evolved from distributed IP networks loosely united to provide a globally used capability.


Is it a grid or a cluster?

Clusters are often compared to, and confused with, grids. A cluster can be defined as a group of computers coupled together through a common operating system, security infrastructure and configuration that are used as a group to handle users' computing jobs. Clusters fall into a variety of categories, including the following.

  • High performance computing (HPC) clusters provide a cost-effective capability that rivals or exceeds the performance of large shared-memory multiprocessors for many applications. Such clusters typically consist of thousands, tens of thousands, or hundreds of thousands of compute elements (i.e., processors or cores) and a high performance network (e.g., Myrinet or InfiniBand) that typically offers substantially lower latency and higher bandwidth than commodity Ethernet.
  • Beowulf clusters are composed of commodity-hardware compute nodes running Linux and linked by dedicated interconnects (similar architectures exist using other operating systems).
  • "Cycle-scavenging" services aggregate and schedule access to compute cycles that would otherwise go unused on individual systems, which need not run the same operating system (e.g., Condor pools).

For the purposes of this Cookbook, a grid is assumed to consist of at least two such systems connected across administrative domains.
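The working definition above can be sketched as a simple check. This is only an illustration of the definition, not part of any grid product: the `System` class, its `admin_domain` field, and the example names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    """A cluster or similar system (field names are illustrative only)."""
    name: str
    admin_domain: str  # hypothetical label for the organization that administers it


def is_grid(systems):
    """Apply the working definition: at least two systems that
    together span more than one administrative domain."""
    domains = {s.admin_domain for s in systems}
    return len(systems) >= 2 and len(domains) >= 2


# Two clusters run by the same department: a cluster environment, not a grid.
single_domain = [System("hpc-a", "physics-dept"), System("hpc-b", "physics-dept")]

# A cluster and a Condor-style pool owned by different institutions.
multi_domain = [System("hpc-a", "university-x"), System("pool-b", "university-y")]

print(is_grid(single_domain))  # False
print(is_grid(multi_domain))   # True
```

The check deliberately ignores hardware and operating system, since the definition turns on crossing administrative boundaries rather than on how each system is built.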

A computational grid emphasizes aggregate compute power and performance through its collective nodes. A data grid emphasizes discovery, transfer, storage and management of data distributed across grid nodes.


What instruments, resources and services might you find on a grid?

The predominant impression, or sometimes de facto definition, of a grid is that it is a collection of computational resources that can be combined to produce a greater HPC capability than each resource can provide on its own. In fact, many grids are focused on computation, at least initially, since the concepts and processes for combining computational elements are the most mature and compute-intensive applications are more obviously positioned to benefit from the multiplication of capability made possible by grid technology. A grid, however, can facilitate access to a wide variety of resources, and the type and timing of resources to be added to any given grid depend on the intended user community and application set. Resources other than compute resources may be more obvious or compelling for a particular community to share, such as visualization tools, high-capacity storage, data services, or access to unique or distributed instruments (e.g., telescopes, microscopes, sensors).

The actual process for adding a resource to a grid (or "grid-enabling" the resource) varies according to the type of resource being added as well as the grid technology in use. Compute resources are often the focus of examples within this Cookbook due to their prevalence and relatively straightforward (or at least common!) inclusion in a grid. Processes to grid-enable other types of resources (e.g., data services, visualization, instruments) are less well known, are likely to be more variable from grid product to grid product, and may also be proprietary or highly dependent on the technical specifications of the particular device.

Some examples that illustrate the value and variety of making different resources available via a grid include:

  • George E. Brown, Jr. Network for Earthquake Engineering Simulation [1] - From their Web site: "NEES is a shared national network of 15 experimental facilities, collaborative tools, a centralized data repository, and earthquake simulation software, all linked by the ultra-high-speed Internet2 connections of NEESgrid. Together, these resources provide the means for collaboration and discovery in the form of more advanced research based on experimentation and computational simulations of the ways buildings, bridges, utility systems, coastal regions, and geomaterials perform during seismic events ... NEES will revolutionize earthquake engineering research and education. NEES research will enable engineers to develop better and more cost-effective ways of mitigating earthquake damage through the innovative use of improved designs, materials, construction techniques, and monitoring tools." The NEES Central portal provides a single launching point for access to a variety of facilities (see NEEScentral web site [20]) including instruments such as geotechnical centrifuges, shake tables and tsunami wave basins.

  • Laser Interferometer Gravitational-Wave Observatory (LIGO) [3] - From their Web site: "The Laser Interferometer Gravitational-Wave Observatory (LIGO) is a facility dedicated to the detection of cosmic gravitational waves and the harnessing of these waves for scientific research...the LIGO Data Grid is being developed with an initial focus on distributed data services - replication, movement, and management - versus high-powered computation." The gravitational wave detectors produce large amounts of observational data, which scientists working in this field analyze alongside expected or predicted data of similar scale.

  • Earth System Grid [4] - From their Web site: "The primary goal of ESG is to address the formidable challenges associated with enabling analysis of and knowledge development from global Earth System models. Through a combination of Grid technologies and emerging community technology, distributed federations of supercomputers and large-scale data and analysis servers will provide a seamless and powerful environment that enables the next generation of climate research." Both data resources/services and high performance computational resources are necessary on this grid to meet a primary project objective: "High resolution, long-duration simulations performed with advanced DOE SciDAC/NCAR climate models will produce tens of petabytes of output. To be useful, this output must be made available to global change impacts researchers nationwide, both at national laboratories and at universities, other research laboratories, and other institutions."

  • cancer Biomedical Informatics Grid (caBIG) [5] - From their Web site: "To expedite the cancer research communities' access to key bioinformatics tools, platforms and data, the NCI is working in partnership with the Cancer Center community to deploy an integrating biomedical informatics infrastructure: caBIG (cancer Biomedical Informatics Grid). caBIG is creating a common, extensible informatics platform that integrates diverse data types and supports interoperable analytic tools in areas including clinical trials management, tissue banks and pathology, integrative cancer research, architecture, and vocabularies and common data elements." The current suite of software development toolkits, applications, database technologies, and Web-based applications from caBIG are openly available from their Tools, Infrastructure, Datasets Web site [21], as tools for the target research community but also as models and reusable components for meeting similar service needs in other grid environments.

  • Two notable initiatives are also addressing, at a more general level, the specific question of how to connect and control instruments within a grid environment:

    • Grid-enabled Remote Instrumentation with Distributed Control and Computation (GRIDCC) [2] - From their Web site: "Recent developments in Grid technologies have concentrated on providing batch access to distributed computational and storage resources. GRIDCC will extend this to include access to and control of distributed instrumentation ... The goal of the GRIDCC project is to build a widely distributed system that is able to remotely control and monitor complex instrumentation."
    • Instrument Middleware Project [6] - From their Web site: "The Common Instrument Middleware Architecture (CIMA) project, supported by the National Science Foundation Middleware Initiative, is aimed at 'Grid enabling' instruments as real-time data sources to improve accessibility of instruments and to facilitate their integration into the Grid... The end product will be a consistent and reusable framework for including shared instrument resources in geographically distributed Grids."

    Both of the above initiatives are deploying their emerging products and services in specific pilot applications to verify the efficacy and extensibility of their architecture and approach. Between the two initiatives, examples of grid-enabled instrumentation are being further developed in several diverse fields, including electrical and telecommunication grids (those "other grids"!), particle physics, earth observation and geohazard monitoring, meteorology, and x-ray crystallography.


Who can access grid resources?

Authentication (authN) and authorization (authZ) are used together on grids to enforce conditions of use for resources as specified by the resource owner. This is recognized by Foster et al. in describing grid technology as a "resource-sharing technology with software and services that let people access computing power, databases, and other tools securely online across corporate, institutional, and geographic boundaries without sacrificing local autonomy" [11]. A researcher in the higher-education community, for example, may be not only a computer user on their campus's primary network but also a user of regional, national, or international resources within grid-based projects. Each grid determines what process and proof are acceptable to identify a user (authentication), and decides what that user is then authorized to access (authorization).

Authentication (authN) is the act of identifying an individual user through the presentation of some credential. It does not include determining what resources the user can access, which is considered authorization. The process of authentication verifies that a real-world entity (e.g., person, compute node, remote instrument, application process) is who or what its identifier (e.g., username, certificate subject) claims it to be. In the process, the authentication credentials are evaluated and verified as being from a trusted source and at a particular level of assurance. Examples of credentials include a smartcard, response to a challenge question, password, public-key certificate, photo ID, or a biometric such as a fingerprint [12] [13] [14]. Authentication is often discussed under the broader heading of identity management.

Authorization (authZ) refers to the process of determining the eligibility of a properly authenticated entity to perform the functions that it is requesting (access a grid-based application, service, or resource, for instance). The term "authorization" may be applied to the right or permission that is granted, the issuing of the token that proves a subject has that right, or to the token itself (e.g., a signed assertion). Signed assertions and other authorization characteristics are stored for reference in a variety of ways: within a local file system, on an external physical device (e.g., a smartcard), in a separate data system, or within system or enterprise-wide directories [12] [13] [14]. The characteristics that are assessed to determine status or levels of authorization for a given entity are often referred to as "attributes" of that entity.
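The separation between the two steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the mechanism of any particular grid toolkit: the password credential, the in-memory stores, the attribute names, and the resource names are all hypothetical, and a production grid would more likely rely on certificates and signed assertions.

```python
import hashlib
import hmac
from dataclasses import dataclass, field


@dataclass
class Entity:
    """An authenticated entity and the attributes used for authorization."""
    identifier: str
    attributes: dict = field(default_factory=dict)


def _derive(password, salt):
    # Derive a key from the password so the store never holds it in clear text.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


# Hypothetical stores; a real deployment would use a directory or similar service.
SALT = b"demo-salt"
CREDENTIALS = {"alice": _derive("correct horse", SALT)}
ENTITIES = {"alice": Entity("alice", {"project": "climate", "role": "researcher"})}

# Hypothetical resource-owner policies: the attributes each resource requires.
POLICIES = {
    "storage-array": {"project": "climate"},
    "telescope": {"role": "operator"},
}


def authenticate(identifier, password):
    """authN: verify the credential; says nothing about what may be accessed."""
    expected = CREDENTIALS.get(identifier)
    if expected is None:
        return None
    if hmac.compare_digest(expected, _derive(password, SALT)):
        return ENTITIES[identifier]
    return None


def authorize(entity, resource):
    """authZ: compare the entity's attributes with the resource owner's policy."""
    required = POLICIES.get(resource)
    if required is None:
        return False  # no policy on record, so deny by default
    return all(entity.attributes.get(k) == v for k, v in required.items())


alice = authenticate("alice", "correct horse")
print(authorize(alice, "storage-array"))  # True: project attribute matches
print(authorize(alice, "telescope"))      # False: alice is not an operator
```

Keeping `authenticate` and `authorize` as separate functions mirrors the text: a grid can change how identity is proven without touching the policies that resource owners set, and vice versa.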

Organizations contributing to a grid infrastructure develop policies for conditions of use of the grid resources and use authentication and authorization tools to implement those policies. Several types of authentication and authorization mechanisms have been developed or adopted for grids over time and are in active use today. There is not (yet?) consensus on which technologies are or will prove to be most effective, particularly for grids to scale to the level of global infrastructure, or for inter-departmental, inter-institutional, multi-project or multi-purpose grids, in which resources are not governed under the same administrative domain. However, a variety of sound, operational authN/Z approaches do exist. It is valuable to review several options when deciding on an approach to meet immediate as well as future needs of a given grid deployment, keeping in mind that choosing a particular toolkit may lock you into a particular authentication/authorization model.


Bibliography

[1] George E. Brown, Jr. Network for Earthquake Engineering Simulation (http://www.nees.org)
[2] Grid Enabled Remote Instrumentation with Distributed Control and Computation (GRIDCC) (http://www.gridcc.org/)
[3] Laser Interferometer Gravitational-Wave Observatory (LIGO) (http://www.ligo.caltech.edu)
[4] Earth System Grid (http://www.earthsystemgrid.org/)
[5] cancer Biomedical Informatics Grid (caBIG) (http://cabig.cancer.gov/index.asp)
[6] Instrument Middleware Project (http://www.instrumentmiddleware.org/metadot/index.pl)
[7] Grid Café (http://gridcafe.web.cern.ch/gridcafe/gridatwork/gridatwork.html)
[11] Foster, Kesselman, Nick, and Tuecke, "The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration" (2002)
[12] nmi-edit Glossary (http://www.nmi-edit.org/glossary/index.cfm)
[13] GFD Authorization Glossary (http://www.gridforum.org/documents/GFD.42.pdf)
[14] Internet2 Authentication WebISO (http://middleware.internet2.edu/core/authentication.html)
[17] SURA's NMI Case Study Series (http://www.sura.org/programs/nmi_testbed.html#NMI)
[18] Adiga, Henderson, Jokl, et al. "Building a Campus Grid: Concepts and Technologies" (September 2005) (http://www1.sura.org/3000/BldgCampusGrids.pdf)
[19] Adiga, Barzee, Bolet, et al. "Authentication & Authorization in SURAgrid: Concepts and Technologies" (May 2005) (http://www1.sura.org/3000/SURA-AuthNauthZ.pdf)
[20] NEEScentral website (https://central.nees.org/?action=DisplayFacilities)
[21] caBIG Tools, Infrastructure, Datasets (https://cabig.nci.nih.gov/inventory/)

© 2006-8, Southeastern Universities Research Association
Sponsored by SURA, TATRC (No. W81XWH-06-1-0419), OSG, and iVDGL
Updated September, 2007