
Current Technology for Grids


An overview of grid fabric

A grid requires a minimum set of basic services to function properly and be distinguishable from other forms of distributed computing. Though the particular needs of the community that will utilize the grid may prescribe additional or more detailed functionality, the following basic grid services provide a commonly useful foundation:

  • User interface
  • Access management (authentication and authorization)
  • Resource discovery and management
  • Data management
  • Job scheduling and management
  • Grid administration
  • Monitoring

Several other grid services are desirable, though not necessary, and are still relatively immature compared to those listed above, given the current landscape of grid standards and products reflecting those standards. Among these are meta-scheduling (coordination of job scheduling and submission across resources grid-wide), user account management and reporting, shared file systems, and workflow management.

Even as grid standards are still being defined, there are already many products available for implementing a grid today. Considering this, any given grid is partially defined by the functionality, focus and features of the product(s) that are used to implement it — a computing versus data grid, for instance, or scheduled versus opportunistic use of resources. The sections below provide a bit more detail on each of the basic grid services and provide examples of products commonly in use today. In particular, several functions are discussed within the context of the Globus Toolkit [1], which is an open source product that has been available for many years and has become a dominant product for assembling and managing resources in a grid, particularly among the academic community.


User interface

Resources on a grid remain accessible at an individual system level and can therefore be accessed and used through remote login to a user account. This is still a popular access method within a grid environment, particularly for researchers who already use computers in their work and are very familiar and comfortable with this type of access. Many of these researchers are reluctant to change even if more "user friendly" options are made available to them. Once a grid user is logged in, grid commands can be entered and executed alongside other system commands, subject to the permissions of the user account, and no learning curve is necessary beyond an understanding of basic grid commands. Remote login, however, is arguably not a new access method, nor one that requires use of grid technologies, and it leaves much of the work to the user. For instance, users have to decide where to run their jobs and track progress themselves. For users who are not as well versed in command line access, or would prefer more automated functionality, graphical user interfaces such as web-based grid portals can provide a less cryptic, more customized and often more efficient user experience.

A grid portal can be defined as a web-based interface that provides users with access to grid resources and services via a standard Web browser. By leveraging the Web environment and technologies, grid functions such as resource discovery, job submission, and monitoring can be combined in a portal with other useful features such as documentation, collaboration tools and "MyPortal" style customization for group or individual user views. Although grid portals are typically designed to meet the needs of specific projects or communities, the resulting functionality is often similar. Today, portals are most useful for introducing people to the grid and for running and managing small to moderate numbers of jobs. A grid portal may, however, be a more difficult method for running and tracking very large numbers of jobs, which is a necessity for some grid users.

Initially, grid portals were also designed and implemented using quite different approaches in terms of their architecture and programming. This made it difficult if not impossible to reuse or leverage components to speed the development of similar or subsequent portals. Today, the JSR168 specification [2] serves as a standard to guide portal design and implementation. JSR 168: Portlet Specification v1.0 defines three major portal components [3]:

PLT.2.1 What is a Portal?
A portal is a web based application that commonly provides personalization, single sign on, content aggregation from different sources and hosts the presentation layer of Information Systems. Aggregation is the action of integrating content from different sources within a web page. A portal may have sophisticated personalization features to provide customized content to users. Portal pages may have different set of portlets creating content for different users.

PLT.2.2 What is a Portlet?
A portlet is a Java technology based web component, managed by a portlet container that processes requests and generates dynamic content. Portlets are used by portals as pluggable user interface components that provide a presentation layer to Information Systems.

PLT.2.3 What is a Portlet Container?
A portlet container runs portlets and provides them with the required runtime environment. A portlet container contains portlets and manages their lifecycle. It also provides persistent storage for portlet preferences. A portlet container receives requests from the portal to execute requests on the portlets hosted by it. A portal and a portlet container can be built together as a single component of an application suite or as two separate components of a portal application.

PLT.2.4 An Example
The following is a typical sequence of events, initiated when users access their portal page:

  • A client (e.g., a web browser) after being authenticated makes an HTTP request to the portal.
  • The request is received by the portal.
  • The portal determines if the request contains an action targeted to any of the portlets associated with the portal page.
  • If there is an action targeted to a portlet, the portal requests the portlet container to invoke the portlet to process the action.
  • A portal invokes portlets, through the portlet container, to obtain content fragments that can be included in the resulting portal page.
  • The portal aggregates the output of the portlets in the portal page and sends the portal page back to the client.
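The sequence above can be sketched in a few lines of code. This is only an illustration of the flow, not the JSR 168 API itself (which is Java); the class and method names `Portlet`, `PortletContainer`, `Portal`, `process_action`, `render` and `handle_request` are hypothetical stand-ins, and authentication is assumed to have already happened.

```python
from typing import Dict, List, Optional


class Portlet:
    """Hypothetical pluggable UI component (not the javax.portlet API)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.last_action: Optional[str] = None

    def process_action(self, action: str) -> None:
        # Record the state change requested by the user.
        self.last_action = action

    def render(self) -> str:
        # Produce a content fragment for inclusion in the portal page.
        state = self.last_action or "idle"
        return f"<div>{self.name}: {state}</div>"


class PortletContainer:
    """Runs portlets on behalf of the portal."""

    def __init__(self, portlets: List[Portlet]) -> None:
        self._portlets: Dict[str, Portlet] = {p.name: p for p in portlets}

    def names(self) -> List[str]:
        return list(self._portlets)

    def invoke_action(self, name: str, action: str) -> None:
        self._portlets[name].process_action(action)

    def invoke_render(self, name: str) -> str:
        return self._portlets[name].render()


class Portal:
    """Receives requests and aggregates portlet output into one page."""

    def __init__(self, container: PortletContainer) -> None:
        self.container = container

    def handle_request(self, target: Optional[str] = None,
                       action: Optional[str] = None) -> str:
        # If the request carries an action for a portlet, dispatch it first.
        if target is not None and action is not None:
            self.container.invoke_action(target, action)
        # Then collect a fragment from every portlet on the page.
        fragments = [self.container.invoke_render(n)
                     for n in self.container.names()]
        # Aggregate the fragments into the page returned to the client.
        return "<html><body>" + "".join(fragments) + "</body></html>"
```

A request such as `Portal(container).handle_request(target="jobs", action="submit")` would first update the targeted portlet and then return a page containing one fragment per portlet.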

Although a well-designed portal can significantly enhance the accessibility of grid computing, particularly for non-technical users, even this improved user experience most often requires that users be aware of specific details of the available resources and make educated decisions in selecting a specific computational resource for their job. This problem is complicated by the fact that different users of the same portal may see the same set of resources but be authorized to use different subsets of resources, or have access to the same resources but with differing authorization levels. Effective grid usage today often requires users to be aware of which resources they are authorized to use and also explicitly check each resource's current operational status before picking one and submitting their job to it. A user would prefer to simply have their job run on whatever resource or combination of resources would ensure the best performance, a problem that can be solved through more full-featured portals, improved system monitoring and reporting, and intelligent metascheduling.


Access management

As discussed in the earlier section "Who can use grid resources?", users must be both authenticated and authorized to access grid resources. Approaches to this vary greatly across grid-building products, especially if academic, government and commercial sectors are all considered. PKI (public key infrastructure) is becoming an authentication technology of choice for many government uses, both grid and non-grid based, and is also the basis for authentication within Globus, which is heavily used by the academic sector. Globus GSI (Grid Security Infrastructure) relies on PKI and its related exchange of certificates, including proxy certificates, for authentication, and provides for authorization through a "grid-mapfile" that is used to associate properly authenticated users with individual system accounts. Grid users obtain an acceptable certificate through a Certificate Authority (CA) that meets the operational standards and level of assurance (LoA) of the particular grid environment they are working within. A grid initiative may set up its own CA for this purpose, or use certificates from an existing CA that is compatible in practice and intent. Warning: some people do not realize how much work it can be to run a CA. Before deciding to do so, it is worthwhile to investigate use of existing CAs and also to talk to others who run their own CAs to learn more about the requirements.
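To make the grid-mapfile idea concrete, the following is a minimal sketch of how a certificate subject could be looked up and mapped to a local account. It assumes the common grid-mapfile layout of a quoted Distinguished Name followed by one or more comma-separated local account names; the sample entries and the function names `parse_grid_mapfile` and `authorize` are hypothetical, and a real Globus installation performs this lookup internally after the certificate chain has been verified.

```python
import shlex
from typing import Dict, List, Optional

# Hypothetical sample entries, for illustration only.
SAMPLE_MAPFILE = '''
# DN                                  local account(s)
"/O=Grid/OU=Example/CN=Jane Doe" jdoe
"/O=Grid/OU=Example/CN=John Smith" jsmith,gridguest
'''


def parse_grid_mapfile(text: str) -> Dict[str, List[str]]:
    """Return a mapping of certificate subject DN to local account names."""
    mapping: Dict[str, List[str]] = {}
    for line in text.splitlines():
        try:
            # shlex honours the double quotes around the DN and drops comments.
            tokens = shlex.split(line, comments=True)
        except ValueError:
            continue  # unbalanced quotes: treat the line as malformed
        if len(tokens) != 2:
            continue  # blank, comment-only or malformed line
        dn, accounts = tokens
        mapping[dn] = accounts.split(",")
    return mapping


def authorize(mapping: Dict[str, List[str]], dn: str) -> Optional[str]:
    """Return the first local account mapped to an authenticated DN, if any."""
    accounts = mapping.get(dn)
    return accounts[0] if accounts else None
```

In this sketch a user whose DN does not appear in the file is simply not authorized (`authorize` returns `None`), which mirrors the role of the mapfile as an access list as well as an account map.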

Though the primary need in each grid initiative is to manage access to resources and applications within its own environment, grid-to-grid integration is rapidly becoming a high priority and is a prerequisite to creating a global grid infrastructure similar in pervasiveness to the Internet. Development of interoperable PKI "fabric" for grids worldwide is coming about through the efforts of the IGTF (International Grid Trust Federation [4]). This effort is complemented by a growing recognition that mechanisms being developed for inter-institutional sharing, via a grid or otherwise, should be compatible with middleware for identity management currently under development for the higher education community, and in collaboration with the federal government, through groups such as EDUCAUSE and Internet2 (e.g., HEBCA [5], FEBCA [6], USHER [7]). The goal is the availability of secure and authoritative campus-issued credentials that enable researchers to use their local identity within and beyond the institution instead of managing multiple credentials for different projects and environments.

There is also much effort underway in the grid and middleware communities to build and enhance tools for managing virtual organizations and augmenting the Globus toolkit so that it can make more direct use of emerging security assertion-based mechanisms for authentication and authorization decisions. These mechanisms can merge the more traditional virtual organization concepts of first authenticating the user and then looking up attributes that determine what the user is allowed to do into one process that uses other backend infrastructure to deliver signed assertions specifying a user's role and/or what they are allowed to do. These technologies become even more interesting when you consider that this type of technology is also becoming widely deployed in the community for non-grid purposes. It's safe to say, however, that no single grid initiative has yet found a universally useful and deployable solution in this area.

Access management and security is a complex topic and components providing these features vary greatly depending on the product(s) used to build any given grid. Some examples of security components available within a Globus grid are:

Overview of Globus security components [8]

Web Services Authentication and Authorization [9] provides message level security through the WS-Security standard and the WS-SecureConversation specification, Transport-level security (TLS) support, and an Authorization Framework that allows a number of different authorization schemes.

In a pre-Web services mode of grid operation, Common Authorization Service (CAS) provides a means for virtual organizations to express policies covering distributed resources across multiple sites: GT 4.0: Security: Pre-Web Services Authentication and Authorization [10].

To delegate a single credential to be shared across multiple invocations of services on a hosting environment, Globus provides a Delegation Service. This can be used, for example, for multiple GRAM job submissions or Reliable File Transfer (RFT) submissions.

For setting up a Certificate Authority, the Globus Toolkit bundles in the SimpleCA package [11], designed by the VPN Consortium and based on OpenSSL Certificate Authority software.

MyProxy is a popular package for credential management that is widely used in grid environments and should be a serious consideration for any new grid. See http://grid.ncsa.uiuc.edu/myproxy/ [12] or http://www.globus.org/toolkit/docs/4.0/security/myproxy/ [13].

Additional software packages also work with MyProxy. For instance, the Grid Account Management Architecture (GAMA [14]) adds account management capability.

Virtual Organization Membership Service (VOMS [42]) is a service that provides authorization for users within virtual organizations, using concepts of membership and roles. It is currently maintained through the Enabling Grids for E-sciencE (EGEE) project and is in large-scale use by the Open Science Grid (OSG).

Portal-Based User Registration Service (PURSE [45]) is an integrated solution that combines the SimpleCA software and MyProxy components with a back-end database and an easy to use web portal to automate user registration.


Resource registration, discovery, and management

Resource discovery and management is a necessity in a grid environment in order to determine information about which resources can be allocated for a given grid job. A resource management service can test the conditions of the allocated grid resources, launch the job if all of the conditions are met, and report back on what happened — possibly under conditions where real-time interaction with the user is impractical (e.g., remote location, time difference).

The Globus Toolkit provides a framework for discovery and management of grid resources that comprises grid services and libraries as well as a highly standards-based security subsystem that addresses message protection, authentication, delegation and authorization. Developers can use the software, services and libraries provided to build and customize a grid environment that meets the requirements of a targeted user community. Those implementing grids on behalf of particular user communities should review the goals of the intended grid environment in order to determine which components will be desirable and required.

The current version of Globus (as of November 2006) is GT 4.0.3, which is based on industry-standard Web services protocols and mechanisms. To accommodate established grids while migrating to Web Services (WS), GT 4.x versions support "legacy" components from prior versions: pre-WS Grid Resource Allocation and Management (GRAM) for execution management, pre-WS Monitoring and Discovery System (MDS) for information services, and pre-WS Authentication and Authorization (AA).

The Globus Toolkit may be downloaded, built, installed and configured from source or a binary installer version may be used. The Globus Toolkit 4.0 Admin Guide provides comprehensive documentation covering all options of the toolkit, pre-requisite software required, environment variables that need to be set, etc. as well as information on migrating from older toolkit versions. Application Programming Interface (API) documentation is available for C and Java.


Data management

Data movement and management are required to provide reliable access to stored data that is used or created by compute resources. The amount of data to be manipulated may be huge, depending on the particular application. Data transfer may occur under several scenarios:

  • Autonomously — independent of any particular submitted job (e.g., ad hoc file transfer or a scheduled data transfer via dedicated network shares or data grids).
  • Staging — manually uploading the data to the clusters to ensure that data is available when and where it is needed for a particular job.
  • As a result of computation, conditional on the outcome of the computation.
  • During a computation, as an intermediate stage of the computation ("data pathways").

Data management overall is a complex topic and the development of grids that are optimized for the handling of distributed data is an evolving area of research, even as basic services are being developed and deployed. Some proprietary commercial approaches are available (e.g., Avaki [43]), as well as open-source (e.g., Globus components for grid data management [44]). We hope to expand on this important area of grid development and use in future versions of the Cookbook.


Job scheduling and management

Job submission by end-users requires some method to define job parameters such as the location path of software or data, the chosen set of computational resources, any conditional execution or triggers/blocks, and any required authentication/authorization information. These collective details form the job description that is used to queue the job for execution on appropriate resources. Workload management systems, also called Distributed Resource Management systems (DRMs), provide resource management for jobs that are submitted to run on any given resource (that is, "local scheduling", or use of resources at a single site, as opposed to grid-wide scheduling, or meta-scheduling).
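As an illustration of what such a job description collects, here is a minimal sketch in Python. It is not the format of any particular workload management system or of Globus; the `JobDescription` class, its field names and the example paths are all hypothetical, chosen only to mirror the parameters listed above.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class JobDescription:
    """Hypothetical container for the details that define a grid job."""
    executable: str                                       # path of the software to run
    arguments: List[str] = field(default_factory=list)
    input_files: List[str] = field(default_factory=list)  # paths of required data
    resources: List[str] = field(default_factory=list)    # chosen compute resources
    run_after: List[str] = field(default_factory=list)    # jobs that must finish first
    credential: str = ""                                   # reference to a credential

    def problems(self) -> List[str]:
        """Return a list of reasons the job cannot be queued (empty if none)."""
        issues = []
        if not self.executable.startswith("/"):
            issues.append("executable must be an absolute path")
        if not self.resources:
            issues.append("no computational resource selected")
        if not self.credential:
            issues.append("missing credential reference")
        return issues


# Example: a complete description, ready to be handed to a scheduler.
job = JobDescription(
    executable="/bin/hostname",
    resources=["cluster-a"],          # hypothetical resource name
    credential="/tmp/x509up_u1000",   # hypothetical proxy certificate path
)
```

Calling `job.problems()` returns an empty list for the example above, while `asdict(job)` gives a plain dictionary that a submission tool could serialize into whatever format the target system expects.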

Workload management systems are available commercially as well as via open-source. High Performance Computing vendors generally prefer or recommend a workload management system for their products but other workload management systems are available. Some of the most well-known workload management systems include:

  • Load Sharing Facility (LSF) [15], a commercial system from Platform Computing
  • LoadLeveler (LL) [16], developed by IBM for their systems
  • Sun Grid Engine (SGE) [17], available commercially from Sun Microsystems and also contributed by them in an open-source version.
  • The Portable Batch System (PBS) from Altair [18], available in open-source and commercial versions, with the commercial version, PBS Pro, also available at no charge to degree-granting universities.
  • Condor [19] is a batch job system that can take advantage of both dedicated and non-dedicated computers to run jobs. It focuses on high-throughput rather than high-performance, and provides a wide variety of features including checkpointing, transparent process migration, remote I/O, parallel programming with MPI, the ability to run large workflows, and more. Condor-G is designed to interact specifically with Globus and can provide a resource selection service to different and multiple grid sites.

SGE and LSF both have a foundation in the Codine Distributed Queuing System. LL has been a product of IBM for a number of years. PBS was developed by the NAS Division of the NASA Ames Research Center in the early to mid-1990s, specifically for parallel systems, including cluster systems. Condor is developed by the University of Wisconsin-Madison.

Each of these workload management systems offers a wide range of configuration options. Most are designed primarily for time-sharing but offer some level of space sharing configuration options. (PBS is based on space sharing but offers time-sharing options also.) What works best for any given site in terms of functionality, configuration and even policy (ability to implement policy with the technology) can vary and user requirements should be gathered to determine a best fit as part of any new installation.

A site often uses the same workload management system for all of their resources, but workload management systems in use across a grid often vary. Ideally, which workload management system is ultimately used to submit a job should be transparent to the grid user. In addition to providing a suite of web services to submit, monitor and cancel jobs in a grid environment, Globus provides interface support for several common workload management systems and directions for developing an alternative interface to shield the user from system-specific detail.


Administration and monitoring

Grid administration tools give the administrator a sense of localized control of resources even though the grid resources may not be geographically near the administrator. Grid administration tools today are mostly for controlling authorization and authentication. Ideally, however, they will evolve to match the richness and functionality of the tools that have evolved for local workstation system administration.

Monitoring the state of grid resources, services and job activity is an important part of managing a grid environment. It is important for grid administrators to know the current state of the grid to provide operations and support, but it is also an important tool for grid users. Prior to job submission, job monitoring can provide grid users with important information about what resources are accessible via the grid and the existing workloads on each. Once a job is submitted, grid job monitoring becomes a necessity for keeping track of job progress and results. Grid job monitors gather vital information about job submissions on specific resources by harvesting data from local cluster job managers and monitors such as PBS, LSF, and Ganglia. Resource allocation is also facilitated by the use of grid monitoring, which enables grid services on the various resources to be dynamically instantiated and adjusted using constantly running background processes (daemons). In Globus, examples of these background processes include Grid Resource Allocation & Management (GRAM) for job submission, a Grid Resource Information Service (GRIS) that maintains information on software and hardware configuration for a specific node, and a Grid Index Information Service (GIIS) that aggregates GRIS information for a collection of nodes.
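The relationship between a per-node information service and an index that aggregates many of them can be sketched as follows. This is a conceptual illustration only and does not use the Globus MDS interfaces or protocols; `NodeInfoService` plays the role of a GRIS, `IndexService` the role of a GIIS, and all names, fields and sample values are hypothetical.

```python
from typing import Dict, List, Set


class NodeInfoService:
    """GRIS-like role: describes the configuration of one node."""

    def __init__(self, node: str, cpus: int, software: Set[str]) -> None:
        self.node = node
        self.cpus = cpus
        self.software = set(software)

    def report(self) -> Dict[str, object]:
        # A fresh snapshot each time, so changes on the node become visible.
        return {"node": self.node, "cpus": self.cpus,
                "software": sorted(self.software)}


class IndexService:
    """GIIS-like role: aggregates reports from registered node services."""

    def __init__(self) -> None:
        self._sources: List[NodeInfoService] = []

    def register(self, source: NodeInfoService) -> None:
        self._sources.append(source)

    def snapshot(self) -> List[Dict[str, object]]:
        return [s.report() for s in self._sources]

    def nodes_with(self, package: str) -> List[str]:
        # Answer a discovery query: which nodes offer this software?
        return [r["node"] for r in self.snapshot()
                if package in r["software"]]

    def total_cpus(self) -> int:
        return sum(s.cpus for s in self._sources)
```

A user or portal would query only the index, for example `index.nodes_with("blast")`, rather than contacting every node individually.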

The Globus component for grid monitoring is the Monitoring and Discovery System (MDS) [21]. The latest version of Globus, GT4, includes Web-services-based components such as WS-Resource Properties, WS-BaseNotification and WS-ServiceGroup and provides WebMDS as an interface for accessing lower level services. Information provided may come from DRM systems of the grid resources, other Globus services such as GRAM, RFT or RLS, or cluster/system monitors such as Ganglia [22], Nagios [23], or Inca [24].

Monitoring Agents in A Large Integrated Services Architecture (MonALISA [25]) provides a distributed service for monitoring, control and global optimization of complex systems. MonALISA is based on a scalable Dynamic Distributed Services Architecture (DDSA) implemented using Java / JINI and Web Services technologies. The scalability of the system derives from the use of a multi-threaded execution engine to host a variety of loosely coupled, self describing, dynamic services or agents, and the ability of each service to register itself in order to be discovered and used by other services, or clients that require such information.


Metascheduling

Metaschedulers operate at the grid level across potentially numerous resources, gathering and analyzing information from local schedulers in order to assign user jobs to the most suitable resources at any given time. As resources are added to a grid, basic information about each grid resource is provided to the metascheduler to establish ongoing communication for more effective scheduling of grid resources. Implementing a metascheduler is an advanced use of the grid and somewhat of a moving target since the design and development of metaschedulers is an active area of grid technology research and development. A metascheduler operating in conjunction with a portal, however, can significantly improve both the usefulness and efficiency of the grid.
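A very simplified sketch of the selection step is shown below. It is not the algorithm of any metascheduler named in this section; the `ResourceStatus` fields and the ranking rule (shortest queue first, then most free slots) are assumptions made purely to illustrate how information gathered from local schedulers could drive the choice of resource.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class ResourceStatus:
    """Hypothetical summary reported by a local scheduler."""
    name: str
    online: bool
    free_slots: int
    queued_jobs: int
    authorized_users: FrozenSet[str]


def select_resource(user: str, slots_needed: int,
                    resources: Iterable[ResourceStatus]
                    ) -> Optional[ResourceStatus]:
    """Pick a resource the user may use that can start the job now."""
    candidates = [
        r for r in resources
        if r.online
        and user in r.authorized_users
        and r.free_slots >= slots_needed
    ]
    if not candidates:
        return None  # nothing suitable right now; a real system might queue
    # Prefer the shortest queue, then the most free slots, then name order.
    return min(candidates,
               key=lambda r: (r.queued_jobs, -r.free_slots, r.name))
```

The point of the sketch is that the user supplies only an identity and a requirement; the authorization check and operational status check are done on the user's behalf instead of by hand.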

Designing and building metaschedulers is an active field of grid research concurrent with implementation and a variety of diverse approaches are in use or being explored:

Community Scheduler Framework (CSF) is WSRF-compliant and built upon the Globus Toolkit. A grid user may use CSF to submit jobs, create advanced reservations and define preferred scheduling policies at the grid level to access different workload managers. CSF "meta-schedules" jobs between the job management system queues.

GridWay is an open source meta-scheduler and included in the Globus Incubator [27], a program for new projects to eventually become part of Globus. GridWay enables large-scale, secure, reliable sharing of compute resources across multiple systems that may be using various workload management systems, such as PBS, SGE, LSF, Condor or others.

As noted in the earlier section on workload management systems, Condor-G [28] is a workload management product that can also work with other DRMs to provide overall dynamic job management.

United Devices provides HPC Synergy [29], as a commercial solution for optimizing an organization's existing compute resources from desktops to servers and clusters to create an on-demand environment. Synergy works with other workload management systems including LL, Condor, Open PBS and PBS Pro, LSF and SGE.

Another commercial offering is the Moab Grid Suite from Cluster Resources [30].

The EGEE Workload Manager Service is in use and under development within the gLite [31] (Lightweight Middleware for Grid Computing) project.

MARS [32] is a provisioning and workflow architecture being developed by the University of Michigan.


Account management and reporting

Grid-centric account management and reporting is still in its infancy. It currently relies for the most part on local accounting data available from the different grid resources, aggregated as much as possible through pre-packaged accounting software, or through "home-grown" code and scripts. Ideal grid accounting should give grid users and administrators feedback on the resources used by various groups and users grid-wide. This is very important for users, contributors and other stakeholders to understand the impact and extent of grid usage, and also to support and verify policy implementation such as fair scheduling and prioritizing of future jobs based upon previous use. Products are emerging to better meet the needs of grid-wide account management and reporting but more comprehensive and standards-based packages are still needed for a true "meta-view" of a grid.
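The kind of aggregation described here, combining local accounting data from several resources into a grid-wide view, can be illustrated with a short sketch. The record layout (site, group, user, CPU hours), the sample values and the function names are hypothetical and do not correspond to the schema of any accounting package.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (site, group, user, cpu_hours): a hypothetical local accounting record.
UsageRecord = Tuple[str, str, str, float]


def aggregate_usage(records: Iterable[UsageRecord]
                    ) -> Tuple[Dict[str, float], Dict[Tuple[str, str], float]]:
    """Sum CPU hours grid-wide, per group and per (group, user)."""
    by_group: Dict[str, float] = defaultdict(float)
    by_user: Dict[Tuple[str, str], float] = defaultdict(float)
    for _site, group, user, hours in records:
        by_group[group] += hours
        by_user[(group, user)] += hours
    return dict(by_group), dict(by_user)


def groups_by_least_use(by_group: Dict[str, float]) -> List[str]:
    """Order groups from least to most prior use (a crude fair-share input)."""
    return sorted(by_group, key=lambda g: (by_group[g], g))


# Hypothetical records harvested from two sites.
records = [
    ("site-a", "physics", "jdoe", 10.0),
    ("site-b", "physics", "jdoe", 5.5),
    ("site-a", "biology", "jsmith", 3.0),
]
```

A scheduler that prioritizes future jobs based on previous use could consult an ordering such as `groups_by_least_use` when deciding which group's jobs to favor next.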

Grid-wide account management and reporting begins at the level of the individual grid user account. As discussed in previous sections, grid user accounts include validation at a local source, across trusted hosts to use specific resources, software, and data. User authentication and authorization are important components. Following up with management tasks, such as creating accounts and enforcing usage and file quotas, are normal requirements at the local level that should also be viewed at the grid level. And the logical next step is to report that use in various ways and to various people such as the owners of resources (particularly when they want to know how much they use versus how much is used across the grid at large) and to the sponsors who grant the funds.

While many of us are familiar with the UNIX account management and logging tools, management across the grid goes well beyond their capabilities. A number of systems are being developed to accomplish these tasks. We will summarize two, showing some of their features and components.

Accounting, Approaches in use

User accounts

In a grid environment today, it is likely the case that accounts will be created for users both local to the site and remote from the site. (We assume that any policies needed to distinguish their use have already been addressed.) In general:

  • Each grid site appoints a grid administrator that is authorized to use a centralized authentication and authorization system.
  • The grid administrator creates and maintains grid accounts via the centralized authentication and authorization system. This system maintains a grid [LDAP] directory.
  • Under the Globus scheme, the local Unix administrator creates and maintains standard Unix user accounts and home directories on all systems. PKI Subject Distinguished Names (DNs) are then mapped to these local Unix user accounts. These mappings are maintained in a grid-mapfile on each Globus gatekeeper. (Some grid projects provide tools for automating this process.)
  • Certificates are issued to each user and a globus subdirectory is created in the user home directory in which to keep the certificate. (Remote users use the certificate credentials issued by their home site.)
  • Grid accounts are mapped to local accounts.
  • Password synchronization is done as needed. (In some cases authentication is done via certificates, but in other cases passwords may be needed and synchronization may be provided for via local tools.)

Accounting of use

Several grid-wide accounting packages are capable of meeting the needs of a large-scale grid today.

  • Gratia is software developed for the Open Science Grid to collect accounting information. From the OSG Gratia twiki page [33] "The Grid Accounting Project has:
    • designed the schema for the accounting attributes,
    • is ensuring the necessary collectors and sensors are in place in the resource providers,
    • has defined and is deploying repository and access tools for the reporting and analysis of the grid wide accounting information."
  • The SweGrid Accounting System [34] (SGAS) is a Java implementation of a resource allocation enforcement and tracking service, based on the latest Web services technologies. SGAS is a soft-state, non-intrusive Grid accounting solution that includes logging and tracking in GGF Usage Record XML format and a remote and scriptable management interface.


Shared filesystems

The appearance and utility of a single file system across grid resources would arguably be the most effective means for accessing and staging necessary data, libraries and executables within grid jobs, as well as managing and accessing job output. As with metascheduling, this is an area of active research and development and is in its infancy in terms of implementation. As a forerunner and potential model for a grid-wide file system, many high performance computer systems, clusters in particular, use the Network File System (NFS) to create and share a single file system across multiple compute systems. Other examples of shared file systems include:

  • Parallel Virtual File System [35] (PVFS) is a popular open source, high-performance and scalable parallel file system for clusters that requires no special hardware or kernel modifications. PVFS capabilities include a consistent file name space across compute systems, transparent access for existing utilities, physical distribution of data across multiple disks in multiple clusters, and high-performance user space access for applications.
  • General Parallel File System (GPFS)-Wide Area Network (WAN) is another high performance parallel file system that can span systems across a wide area network. An example of GPFS-WAN in use can be found on the TeraGrid.
  • Gfarm [36] from the Asia Pacific Grid (ApGrid) Grid Data Farm project is a next-generation network-shared file system that is recommended for data farms as well as clusters.
  • Lustre is a popular commercial solution designed and developed by Cluster File Systems, Inc. [37].


Workflow processing

A workflow can be thought of as a set of tasks with dependencies. Tasks that are part of a typical grid workflow include access management, discovery and movement of data, and job execution(s). Dependencies that are attached to such tasks may range from evaluation of particular user characteristics (appropriate assurances of authentication or authorization) to availability and control of data and availability and control of resources. Defining a "grid job" at the level of workflow instead of job submission helps realize the benefits of grid technology at the level of an overall user problem or inquiry rather than discrete operations.
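The idea of a workflow as tasks with dependencies can be made concrete with a small sketch that derives a valid execution order from a dependency graph. It uses the `graphlib` module from the Python standard library (Python 3.9 or later) and is not tied to any of the workflow packages discussed in this section; the task names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set

# Each task maps to the set of tasks that must complete before it can start.
# Task names are hypothetical examples of typical grid workflow steps.
workflow: Dict[str, Set[str]] = {
    "authenticate": set(),
    "discover_data": {"authenticate"},
    "reserve_resource": {"authenticate"},
    "stage_data": {"discover_data", "reserve_resource"},
    "run_job": {"stage_data"},
    "collect_output": {"run_job"},
}


def execution_order(tasks: Dict[str, Set[str]]) -> List[str]:
    """Return one order in which every task follows its dependencies."""
    return list(TopologicalSorter(tasks).static_order())


def is_valid_workflow(tasks: Dict[str, Set[str]]) -> bool:
    """A workflow with circular dependencies can never be run."""
    try:
        execution_order(tasks)
        return True
    except CycleError:
        return False
```

Note that `discover_data` and `reserve_resource` depend only on `authenticate`, so a workflow manager would be free to run them in either order or concurrently; the sketch only guarantees that each task appears after everything it depends on.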

Some popular software packages available for defining and managing workflows include:

  • The Directed Acyclic Graph Manager [38] (DAGMan), available with Condor. Once dependencies are identified, DAGMan manages them automatically between Condor jobs.
  • Globus Community Scheduler Framework [40] (CSF) is actually a meta-scheduler but is sometimes included in workflow services. CSF is WSRF compliant and built upon the Globus Toolkit. A grid user may use CSF to submit jobs, create advance reservations and define preferred scheduling policies at the grid level to access different workload managers.
  • Pegasus [41], from the University of Southern California's Information Sciences Institute (ISI), is a flexible framework that enables the mapping of complex scientific workflows onto a grid environment. Pegasus takes an XML-based abstract workflow as input and decides how to run the workflow on a grid.
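
To make the DAGMan entry above more concrete, the sketch below renders a small workflow as a DAGMan input description using the `JOB` and `PARENT ... CHILD` keywords. The helper function, the job names and the submit file names are hypothetical, and only the simplest part of the DAGMan syntax is shown.

```python
def render_dag(jobs: dict[str, str], edges: list[tuple[str, str]]) -> str:
    """Render a DAGMan input description.

    jobs  maps a job name to its Condor submit description file (hypothetical names).
    edges lists (parent, child) pairs: the child runs only after the parent succeeds.
    """
    lines = [f"JOB {name} {submit_file}" for name, submit_file in jobs.items()]
    for parent, child in edges:
        if parent not in jobs or child not in jobs:
            raise ValueError(f"unknown job in dependency: {parent} -> {child}")
        lines.append(f"PARENT {parent} CHILD {child}")
    return "\n".join(lines) + "\n"


dag_text = render_dag(
    {"stage": "stage.sub", "run": "run.sub", "collect": "collect.sub"},
    [("stage", "run"), ("run", "collect")],
)
print(dag_text)
```

Saved to a file, a description of this kind would normally be handed to Condor with `condor_submit_dag`, after which DAGMan submits each job once its parents have completed.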


Bibliography

[1] Globus Toolkit (http://www.globus.org)
[2] JSR168 specification (http://jcp.org/aboutJava/communityprocess/final/jsr168/index.html)
[3] JSR 168: Portlet Specification v1.0, Major Portal Components (http://tinyurl.com/324qrg)
[4] International Grid Trust Federation (http://www.igtf.org)
[5] HEBCA (http://www.educause.edu/HigherEducationBridgeCertificationAuthority/623)
[6] FEBCA (http://www.cio.gov/fbca/)
[7] USHER (http://www.usherca.org/)
[8] Overview of Globus security components (http://www.globus.org/grid_software/security/)
[9] Web Services Authentication and Authorization (http://www.globus.org/grid_software/security/ws-aa.php)
[10] GT 4.0: Security: Pre-Web Services Authentication and Authorization (http://www.globus.org/toolkit/docs/4.0/security/prewsaa)
[11] SimpleCA (http://www.vpnc.org/SimpleCA)
[12] NCSA MyProxy Credential Management Service (http://grid.ncsa.uiuc.edu/myproxy/)
[13] GT 4.0: Credential Management: MyProxy (http://www.globus.org/toolkit/docs/4.0/security/myproxy/)
[14] Grid Account Management Architecture (GAMA) (http://grid-devel.sdsc.edu/gridsphere/gridsphere?cid=gama)
[15] Load Sharing Facility (http://www.platform.com/Products/Platform)
[16] Load Leveler (http://www-306.ibm.com/software/tivoli/products/scheduler-loadleveler)
[17] SUN Grid Engine (http://www.sun.com/software/gridware)
[18] Altair Engineering, Inc. (http://www.altair.com/software/pbspro.htm)
[19] Condor Project (http://www.cs.wisc.edu/condor)
[20] NSF Middleware Initiative Grids Center software distribution (http://www.grids-center.org/)
[21] Monitoring and Discovery System (http://www.globus.org/toolkit/mds)
[22] Ganglia (http://ganglia.sourceforge.net/)
[23] Nagios (http://www.nagios.org)
[24] Inca (http://inca.sdsc.edu)
[25] MonALISA (http://monalisa.cacr.caltech.edu/monalisa.htm)
[27] Globus Incubator (http://dev.globus.org/wiki/Incubator/Incubator_Management)
[28] Condor-G (http://www.cs.wisc.edu/condor/condorg/)
[29] HPC Synergy (http://www.ud.com/products/hpcsynergy.phpa)
[30] Cluster Resources (http://www.clusterresources.com/pages/products/moab-grid-suite.php)
[31] gLite (http://glite.web.cern.ch/glite/wms/)
[32] MARS (http://www-personal.engin.umich.edu/%7Eabose/website/marshome.htm)
[33] Gratia twiki page (https://twiki.grid.iu.edu/twiki/bin/view/Accounting/WebHome)
[34] SweGrid Accounting System (http://www.sgas.se/)
[35] Parallel Virtual File System (http://www.pvfs.org/index.html)
[36] Gfarm (http://datafarm.apgrid.org)
[37] Cluster File Systems, Inc. (http://www.clusterfs.com)
[38] Condor Directed Acyclic Graph Manager (http://www.cs.wisc.edu/condor/dagman/)
[40] Globus Community Scheduler Framework (http://www.globus.org/grid_software/computation/csf.php)
[41] Pegasus (http://pegasus.isi.edu)
[42] VOMS: Virtual Organization Membership Service (http://www.globus.org/grid_software/security/voms.php)
[43] Avaki EII (http://www.sybase.com:80/products/allproductsa-z/avakieii)
[44] Globus Data Management: Key Concepts (http://www.globus.org/toolkit/docs/4.0/data/key/)
[45] Globus PURSE: Portal-based User Registration Service (http://www.globus.org/grid_software/security/purse.php)

© 2006-8, Southeastern Universities Research Association
Sponsored by SURA, TATRC (No. W81XWH-06-1-0419), OSG, and iVDGL
Updated September, 2007