History, Standards & Directions
Introduction
Most software developers are aware of the role and importance of software standards, especially when attempting to
create a distributed middleware infrastructure, or applications and services that can be reused or inter-operate
with other systems or infrastructure. Standards percolate throughout all aspects of software development, from the
formats of datatypes and on-wire protocols through to design patterns and the architecture of component frameworks.
Without software standards, although development can be quicker, developers can easily create "islands" of software
that work as isolated solutions but will need to be revised, sometimes significantly, if the runtime environment changes.
This chapter aims to give the reader an understanding of the status of important current and near-future standards in the
Grid arena. To frame the discussion, a short history of distributed computing, metacomputing, and the Grid is provided.
This history will help put the development of Grid standards in perspective and is followed by a review of several
relevant current standards bodies, along with a summary of the standards associated with each. Additional detail is
provided for key current and emerging standards that will have the most impact on the future of the Grid, followed by some
final conclusions.
History
Early Distributed Computing
The history of distributed computing can arguably be traced to 1960, when J.C.R. Licklider suggested "a network of such [computers], connected to one
another by wide band communication lines" which provided "the functions of present-day libraries together with anticipated advances in information
storage and retrieval and [other] symbiotic functions."
[1].
A large amount of work on networking continued after that, leading to the initial
development of the ARPANET, starting in 1969. The goal of these networks was sometimes simply moving data from one machine to another, but
at other times, it consisted of the more ambitious goal of enabling active processes on multiple machines to communicate with one another. For example,
the 1976 RFC 707
[2]
discussed network-based resource sharing, and proposed the remote procedure call as a mechanism to permit
effective resource sharing across networks.
By the mid 1980s, distributed computing had become an active, major field of research, particularly as local, national and international networks
became more ubiquitous. In 1984, John Gage of Sun used the phrase "The Network is the Computer" to describe the idea that the connections between
computers are really what enable systems to be used effectively. In 1985, the Remote-UNIX project
[3]
at the University of Wisconsin created software to capture and exploit idle cycles in computers (also known as "cycle
scavenging") and provided these to the scientific community, who were looking for additional options to solve
computationally-intense problems. This led to the development of the
Condor project
[4]
, which is widely used today as distributed middleware. In 1989, the first version of
Parallel Virtual Machine (PVM
[5])
was written at Oak Ridge National Laboratory.
PVM enabled multiple, distributed computers to be used to run a single job. PVM initially was used to link together workstations that were located
in the same general area.
Metacomputing
In 1987, the Corporation for National Research Initiatives (CNRI) suggested a research program in Gigabit Testbeds
[6]
to the NSF. This led
to five five-year projects which started in 1990. Some of these projects were focused on networking, and others on applications, including linking
supercomputers together. The term metacomputing was coined to refer to this idea of multiple computers working together while physically
separated. Larry Smarr, then at NCSA, is generally credited with popularizing this term.
In 1995, the I-Way project
[7]
began. This project worked to integrate previous tools and technologies, such as those aimed at locating and
accessing distributed resources for computation and for storage, and a number of network technologies. I-Way was generally viewed as being successful,
as it deployed a distributed platform containing components at seventeen sites for use by 60 research groups. One key part of the project was
the recognition that having a common software stack (I-Soft) installed on a front-end machine (point-of-presence, or I-POP) at each site was an
effective way of hiding some of the complexity about the individual resources and their locations.
Grid Computing
Globus
In 1996, researchers at Argonne National Laboratory and the University of Southern California started The Globus Project
[8,
9,
10,
11].
The aim of the project was to build on earlier work undertaken in the I-Way project and focus on helping scientists develop
distributed and collaborative applications that make use of the Internet's infrastructure for large-scale problems.
At the heart of the project is the Globus Toolkit, which is developed by the Globus Alliance. It provides a number of services, including those
for resource monitoring and discovery, job submission, security, and data management.
The toolkit has evolved many times since its inception in 1996. Globus versions 1 and 2 had procedural interfaces and were aimed more
towards distributed high-performance applications. The Open Grid Services Architecture (OGSA), which was first announced by the Global Grid
Forum in February 2002 and later declared to be its flagship architecture for the Grid, had a significant effect on the Globus Toolkit. OGSA
defines a service-oriented grid architecture. The Globus Alliance produced an incarnation of this architecture in the form of Globus Toolkit version 3 (GT3),
which was first released in July 2003 and implemented the Open Grid Services Infrastructure (OGSI). Critics identified several problems with OGSI, and
consequently in January 2004 Hewlett-Packard, IBM, Fujitsu, and the Globus Alliance announced the WS-Resource Framework (WS-RF).
The Globus Toolkit was refactored again, and in April 2005 version 4 (GT4) of the software was released. In GT4 most of the services provided
are implemented on top of WS-RF, although some are not.
(The Globus Toolkit 3 Programmer's Tutorial provides some additional perspective on this from the Globus team
[82],
including some detail on which services fall into which categories
[83].)
Legion
Legion
[12,
13],
which emerged in late 1993, was an object-based meta-system developed at the University of Virginia. Legion aimed to provide a
software infrastructure so that a system of heterogeneous, geographically distributed, high-performance machines could interact
seamlessly. Legion attempted to provide a user, at their workstation, with a single, coherent, virtual machine. The Legion system
itself was organized by classes and metaclasses and was originally based on Mentat
[14].
In early 1996, Legion received its
first national funding, and the initial prototype was rewritten by November 1997. The system was originally deployed between the
University of Virginia, SDSC, NCSA and UC Berkeley. The system was first demonstrated at Supercomputing in 1997. Legion was
subsequently deployed more widely, including sites in Japan and Europe in what was called NPACI-Net. As Legion was rolled out, various
distributed applications were ported, including those from areas such as materials science, ocean modelling, sequence comparison,
molecular modelling and astronomy.
In 1999, a company called Applied MetaComputing was founded and by 2001 it had raised sufficient venture capital to commercialize
Legion. The company was renamed the AVAKI Corporation, and Legion became Avaki, which was first released as a commercial offering in
September 2001. In 2005, Avaki was purchased by Sybase
[15].
UNICORE
The Uniform Interface to Computing Resources (UNICORE) project
[16]
started in August 1997. The project aimed to seamlessly and securely join a number of German supercomputing centres together
without changing their existing systems or procedures. The UNICORE consortium consisted of developers, supercomputing centres, users and
vendors. The initial UNICORE system had a graphical interface based on Java Applets that was deployed via a Web browser. It also included a
central job scheduler that used Codine from Genias (now Grid Engine
[17],
sponsored by Sun), and a security architecture based on X.509 certificates.
The UNICORE Plus project
[18]
started in January 2000, with two years of funding. The goal of this project was to continue
the development of UNICORE with the aim of producing a grid infrastructure together with a Web portal. It also aimed to harden the software
for production, integrate new services, and deploy the system to more participating sites. The Grid Interoperability Project (GRIP)
[19]
was an overlapping two-year project, funded by the European Union, that started in 2001 and aimed to realize the interoperability of
UNICORE with the Globus Toolkit, as well as working towards Grid interoperability standards.
Finally, the UniGrids project
[20]
, which started in July 2004, is developing Grid services based on OGSA. The goal is to transform
UNICORE into a system with interfaces that are compliant with WS-RF and that can interoperate with other compliant systems.
Standards bodies
The Global Grid Forum (GGF)
http://www.ggf.org/, 2000 — 2006
The Global Grid Forum grew out of a series of conversations, workshops, and Birds of a Feather (BoF) sessions that addressed issues related
to grid computing. The first of these BoFs was held at SC98, the annual conference of the high-performance computing community. That
meeting led to the creation of the Grid Forum, a group of grid developers and users in the U.S. who were dedicated to defining and promoting
grid standards and best practices. By the end of 2000, the Grid Forum had merged with the European Grid Forum (eGrid) and the Asia-Pacific
Grid Forum to form the Global Grid Forum. The first Global Grid Forum meeting was held in March 2001. After that, the GGF produced numerous
standards and specifications documents and held world-wide meetings. The GGF merged with the Enterprise Grid Alliance (EGA) to form the
Open Grid Forum (OGF) in June 2006. GGF standards and products have been subsumed into OGF standards.
The Enterprise Grid Alliance (EGA)
http://gridalliance.org/, 2004 — 2006
The EGA was formed in 2004 to focus exclusively on accelerating grid adoption in enterprise data centres. The EGA addressed obstacles that
organizations face in using enterprise grids through open, interoperable solutions and best practices. The alliance published the EGA Reference
Model and Use Cases
[21],
and documents that described Security Requirements
[22]
as well as Data and Storage Provisioning
[23].
The EGA significantly raised awareness worldwide of enterprise grid requirements through effective marketing programs and regional operations
in Europe and Asia. The EGA merged with the GGF to form the OGF. EGA members were primarily vendors and integrators.
The Open Grid Forum (OGF)
http://www.ogf.org/, 2006 —
The OGF was formed by the merger of the GGF and EGA in June 2006. OGF members include vendors, integrators, academic and
government laboratories and programs, and users. It has working groups in a number of areas, including applications,
architecture, compute, data, management, and security.
- Applications work includes an API for submission and control of jobs (drmaa-wg), an API and related services for
checkpointing (gridcpr-wg), an API for grid remote procedure calls (gridrpc-wg), and an API for grid applications (saga-core-wg).
- In architecture, there is general work on the OGSA specification (ogsa-wg) as well as work to create a name space for
OGSA and to produce a WS-Naming specification (ogsa-naming-wg).
- Compute work includes discussion of resource management protocols (graap-wg) and grid scheduling (gsa-rg), defining a
language for job submission (jsdl-wg), and work in OGSA, which includes a specification for a minimal subset of
services (ogsa-bes-wg), a core use case for high-performance computing (ogsa-hpcp-wg), and protocols for scheduling (ogsa-rss-wg).
- In data, work includes a language to describe data files and streams (dfdl-wg), standards for grid data services (dais-wg),
interfaces and an architecture for grid file systems (gfs-wg), storage management functionality (gsm-wg), the GridFTP
protocol (gridftp-wg), an interface for file-like functionality across grids (byteio-wg), interfaces for moving data across
varying protocols (ogsa-dmi-wg), and an overall data architecture under OGSA (ogsa-d-wg).
- Management work includes defining application contents (acs-wg); describing service configuration, deployment, and
lifecycle management (cddlm-wg); defining an accounting service (rus-wg); and defining a record for use in accounting (ur-wg).
- Finally, in security, there is work on defining specifications for interoperability of authorization components (ogsa-authz-wg).
The Organization for the Advancement of Structured Information Standards (OASIS)
http://www.oasis-open.org/, 1993 —
The Organization for the Advancement of Structured Information Standards (OASIS) is a non-profit, voluntary
international consortium that promotes industry standards for e-business. OASIS was founded in 1993 as SGML Open and changed its
name in 1998 to reflect its expanded technical scope. The consortium produces various Web services standards along with standards
for security and e-business. OASIS has more than 5,000 participants representing over 600 organizations and individual members in 100
countries. The standards include those related to the Extensible Markup Language (XML) and the Universal Description, Discovery, and
Integration (UDDI) service. The Web services standards produced by OASIS focus primarily on higher-level functionality such as security,
authentication, registries, business process execution, and reliable messaging.
The Liberty Alliance
http://www.projectliberty.org/, 2001 —
The Liberty Alliance project is an international coalition of companies, nonprofit groups, and government organizations formed in 2001 to
develop an open standard for federated identity management, which addresses technical, business, and policy challenges surrounding identity
and Web services. The project has the vision of enabling a networked world that is based on open standards where consumers can easily
conduct online transactions in a private and secure way. The Liberty Alliance has developed the Identity Federation Framework, which enables
identity federation and management and provides interface specifications for personal identity profiles, calendar services, wallet services,
and other specific identity services.
The World Wide Web Consortium (W3C)
http://www.w3.org/, 1994 —
The World Wide Web Consortium (W3C) is an international organization conceived by Tim Berners-Lee in 1994 with the
aim of promoting common and interoperable protocols. The W3C created the first Web services specifications in 2003,
which have evolved through several versions and also become the underlying building blocks for many grid services. The
initial focus was on low-level, core functionality such as SOAP and the Web Services Description Language (WSDL), but the W3C
has since spearheaded many other Web-related standards. The W3C has now developed more than 80 technical specifications
for the Web, ranging from XML and HTML to Semantic Web technologies such as the Resource Description Framework (RDF)
and the Web Ontology Language (OWL). W3C members are organizations that typically invest significant resources in Web
technologies. OASIS is a member, and the W3C has partnered with the OGF in the Web services standards area.
The Distributed Management Task Force (DMTF)
http://www.dmtf.org/, 1992 —
The Distributed Management Task Force (DMTF) is an industry-based organization founded in 1992 to develop management
standards and integration technologies for enterprise and Internet environments. The DMTF focuses on developing and unifying management
standards with the aim of enabling a more integrated and cost effective approach to management through interoperable management solutions.
The DMTF has created the Common Information Model (CIM), and also developed communication/control protocols such as Web-Based Enterprise
Management (WBEM), the Systems Management Architecture for Server Hardware (SMASH) initiative, and core management services/utilities.
The DMTF formed an alliance with the GGF in 2003 for the purpose of building a unified approach to the provisioning, sharing, and management
of Grid resources and technologies.
The Internet Engineering Task Force (IETF)
http://www.ietf.org/, 1986 —
The Internet Engineering Task Force (IETF) is an open international community of network designers, operators, vendors, and researchers
concerned with the evolution and smooth operation of the Internet. The Globus Alliance has worked with the IETF to produce two RFCs: RFC4462 —
Generic Security Service Application Program Interface (GSS-API) Authentication and Key Exchange for the Secure Shell (SSH) Protocol
[24],
and RFC3820 — Internet X.509 Public Key Infrastructure (PKI) Proxy Certificate Profile
[25].
These are discussed further under the Grid Security Infrastructure (GSI).
The Web Services Interoperability Organization (WS-I)
http://www.ws-i.org/, 2002 —
The Web Services Interoperability Organization (WS-I) is an open industry body formed in 2002 to promote the adoption of Web services
and interoperability among its different implementations. Its role is to integrate existing standards rather than create new specifications.
WS-I creates, promotes and supports generic protocols for the interoperable exchange of messages between Web services. To do
this, WS-I publishes profiles that describe in detail which specifications a Web service should adhere to and that offer guidance on their proper use.
The overall goal is to provide a set of rules for integrating different service implementations with a minimum
number of features that impede compatibility.
Current standards
Web Services Specifications and Standards
Web Services
[26,
84]
are loosely coupled, platform-independent, XML-based applications that operate and communicate within distributed systems.
The core components of the Web Services architecture are SOAP for communications; the Web Services Description Language (WSDL)
for describing network services as a set of endpoints operating on messages containing either document- or procedure-oriented
information; and the Universal Description Discovery & Integration (UDDI) protocol, which defines a set of services supporting the description
and discovery of businesses, organizations, service providers, the services available, and the technical interfaces used to access these services.
SOAP
SOAP Version 1.2
[27]
is an XML-based protocol intended for exchanging structured information in a distributed environment. SOAP uses XML technologies to
define an extensible messaging framework that can be exchanged over a variety of underlying protocols. The framework has been
designed to be independent of any particular programming model and other implementation specific semantics.
The SOAP Version 1.2 specification consists of three parts:
- Part 0 is a document intended to be a tutorial on the features of SOAP Version 1.2,
- Part 1 is a specification document that defines the SOAP messaging framework,
- Part 2 describes a set of extensions that may be used with the SOAP messaging framework.
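As a rough illustration of the messaging framework defined in Part 1, the following Python sketch builds and then re-reads a minimal SOAP 1.2 envelope using only the standard library. The envelope namespace is the one defined by SOAP 1.2; the application namespace, the getJobStatus element, and the job identifier are hypothetical and serve only as an example payload.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
APP_NS = "http://example.org/jobs"  # hypothetical application namespace


def build_envelope(job_id: str) -> bytes:
    """Build a SOAP 1.2 envelope with an empty Header and a one-element Body."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # Hypothetical payload: the Body may carry any application-defined XML.
    request = ET.SubElement(body, f"{{{APP_NS}}}getJobStatus")
    ET.SubElement(request, f"{{{APP_NS}}}jobId").text = job_id
    return ET.tostring(envelope, encoding="utf-8")


def read_job_id(message: bytes) -> str:
    """Extract the job identifier from the Body of an envelope."""
    root = ET.fromstring(message)
    path = f"{{{SOAP_ENV}}}Body/{{{APP_NS}}}getJobStatus/{{{APP_NS}}}jobId"
    node = root.find(path)
    if node is None:
        raise ValueError("no jobId element in SOAP Body")
    return node.text or ""
```

Because the envelope is plain XML, the same bytes could be carried over HTTP or another underlying protocol without changing either function.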
Web Services Description Language (WSDL)
WSDL 1.1
[28]
is an XML format for describing network services as a set of endpoints operating on messages containing either document-oriented or
procedure-oriented information. The operations and messages are described abstractly, and then bound to a concrete network protocol
and message format to define an endpoint. These related concrete endpoints are combined into abstract endpoints (services). WSDL is
extensible to allow the description of endpoints and their messages regardless of what message formats or network protocols are used
to communicate. However, the only bindings described are in conjunction with SOAP 1.1, HTTP GET/POST, and MIME.
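The split between abstract operations and concrete endpoints can be seen by reading a small WSDL 1.1 document. The sketch below is only illustrative: the two namespaces are those of WSDL 1.1 and its SOAP binding, while the JobService description, its names, and its address are hypothetical and trimmed to the parts discussed above.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"            # WSDL 1.1 namespace
SOAP_BIND_NS = "http://schemas.xmlsoap.org/wsdl/soap/"  # WSDL 1.1 SOAP binding namespace

# Hypothetical, heavily trimmed service description.
WSDL_DOC = """<definitions name="JobService"
    targetNamespace="http://example.org/jobs"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="http://example.org/jobs">
  <message name="StatusRequest"/>
  <message name="StatusResponse"/>
  <portType name="JobPortType">
    <operation name="getJobStatus">
      <input message="tns:StatusRequest"/>
      <output message="tns:StatusResponse"/>
    </operation>
  </portType>
  <binding name="JobSoapBinding" type="tns:JobPortType">
    <soap:binding style="document" transport="http://schemas.xmlsoap.org/soap/http"/>
  </binding>
  <service name="JobService">
    <port name="JobPort" binding="tns:JobSoapBinding">
      <soap:address location="http://example.org/services/jobs"/>
    </port>
  </service>
</definitions>"""


def abstract_operations(root):
    """Map each portType name to the operation names it declares."""
    return {
        pt.get("name"): [op.get("name") for op in pt.findall(f"{{{WSDL_NS}}}operation")]
        for pt in root.findall(f"{{{WSDL_NS}}}portType")
    }


def concrete_endpoints(root):
    """Map each port name to the network address it is bound to."""
    endpoints = {}
    for port in root.iter(f"{{{WSDL_NS}}}port"):
        address = port.find(f"{{{SOAP_BIND_NS}}}address")
        if address is not None:
            endpoints[port.get("name")] = address.get("location")
    return endpoints


root = ET.fromstring(WSDL_DOC)
operations = abstract_operations(root)
endpoints = concrete_endpoints(root)
```

Here the portType carries the abstract description, and the port inside the service element ties it to one concrete address through the binding.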
Universal Description Discovery and Integration (UDDI)
The Universal Description Discovery & Integration (UDDI)
[29]
standard defines a set of services that support the description and discovery of:
- Businesses, organizations, and other Web services providers,
- The Web services they make available,
- The technical interfaces which may be used to access those services.
UDDI is based on a set of standards that includes HTTP, XML, XML Schema, and SOAP. It provides an infrastructure in which Web
services-based software can be published and searched for, either publicly or privately within an organization.
WS-RF
WS-RF
[30]
is a set of Web services specifications being developed by OASIS. Taken together with the WS-Notification
(WSN) specification, these specifications describe how to implement OGSA capabilities using Web services. The purpose of the Web
Services Resource Framework (WS-RF) is to define a generic framework for modelling and accessing persistent resources using
Web services, so that the definition and implementation of a service, and the integration and management of multiple services, are made
easier. WS-RF provides a standard approach to extending Web services, and it builds on the following WS-* specifications:
- WS-ResourceProperties (WS-RP)
[31]
defines how the properties of a WS-Resource are modeled as XML elements in the resource properties document. A WS-Resource
has zero or more properties expressible in XML, representing a view on the WS-Resource's state.
- WS-ResourceLifetime (WS-RL)
[32]
standardizes the means by which a WS-Resource can be destroyed, monitored and manipulated.
- WS-ServiceGroup (WS-SG)
[33]
defines a means of representing and managing heterogeneous, by-reference, collections of Web services. This specification can be used
to organize collections of WS-Resources, for example aggregate and build services that can perform collective operations on a set of WS-Resources.
- WS-BaseFaults (WSRF-BF)
[34]
defines an XML Schema for base faults, along with rules for how this base fault type is used and extended by Web services.
- WS-Addressing
[35]
provides a mechanism to place the target, source and other important address information directly within a Web services message.
In short, WS-Addressing decouples address information from any specific transport protocol. WS-Addressing provides a mechanism
called an endpoint reference for addressing entities managed by a service.
- WS-Notification
[36,
37]
is a family of documents that includes three specifications. WS-BaseNotification defines the Web services interfaces
for NotificationProducers and NotificationConsumers. WS-BrokeredNotification defines the Web services interface for the NotificationBroker,
an intermediary that, among other things, allows the publication of messages from entities that
are not themselves service providers. The third specification, WS-Topics, is described in the next item.
- WS-Topics
[38]
defines a mechanism to organize and categorize items of interest for subscription known as "topics."
WS-RF is itself extensible through other WS-* specifications, such as WS-Policy, WS-Security, WS-Transaction, and WS-Coordination.
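The ideas behind WS-ResourceProperties and WS-ResourceLifetime can be pictured with a small in-memory model. The sketch below is a conceptual analogy only: it does not implement the WS-RF message exchanges, and the class name, method names, and property names are all hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone


class WSResourceSketch:
    """Hypothetical in-memory stand-in for a stateful WS-Resource."""

    def __init__(self, resource_id, properties, lifetime_seconds):
        # The identifier plays the role of the key carried in an endpoint reference.
        self.resource_id = resource_id
        self._properties = dict(properties)
        self.termination_time = datetime.now(timezone.utc) + timedelta(seconds=lifetime_seconds)
        self.destroyed = False

    def properties_document(self):
        """Render the state as an XML document, one child element per property."""
        doc = ET.Element("ResourceProperties")
        for name, value in self._properties.items():
            ET.SubElement(doc, name).text = str(value)
        return doc

    def get_property(self, name):
        """Look up a single property in the properties document."""
        node = self.properties_document().find(name)
        return None if node is None else node.text

    def set_termination_time(self, when):
        """Scheduled destruction: move the time at which the resource expires."""
        self.termination_time = when

    def destroy(self):
        """Immediate destruction."""
        self.destroyed = True

    def is_alive(self, now):
        """A resource is usable until it is destroyed or its termination time passes."""
        return not self.destroyed and now < self.termination_time
```

A client of such a resource sees its state only through the properties document, and its lifetime only through the termination time and the destroy operation, which mirrors the division of labour between the two specifications.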
At the 18th Global Grid Forum meeting (September 2006),
discussions were held on the infrastructure for hosting grid applications, in which WS-RF would evolve into Web Services Resource Transfer (WS-RT). This
evolution is intended to better handle the state information that is required for persistent services.
WS-RT
The Web Services Resource Transfer (WS-RT) specification
[39]
was developed jointly by IBM, Hewlett-Packard, Intel, and Microsoft to provide a unified resource access protocol for Web services.
WS-RT extends the operations of WS-Transfer by adding the capability to operate on fragments of resource representations.
The WS-Transfer specification defines standard messages for controlling resources using the familiar paradigms of "get", "put",
"create", and "delete". The extensions primarily deal with fragment-based access to resources to satisfy the common requirements of
WS-RF and WS-Management. The WS-RT specification is intended to form a core component of a unified resource access protocol for Web services.
The specification intends to meet the following requirements:
- Define a standardized technique for accessing resources using semantics familiar to those in the system management domain:
get, put, create and delete.
- Define WSDL 1.1 portTypes that are compliant with WS-I Basic Profile 1.1.
- Describe the minimal requirements for compliance without constraining richer implementations.
- Describe how to compose with other Web service specifications for secure, reliable, transacted message delivery.
- Provide extensibility for more sophisticated and/or currently unanticipated scenarios.
- Support a variety of encoding formats, including SOAP 1.1 and SOAP 1.2 envelopes, among others.
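The difference between whole-resource operations and fragment-based access can be sketched with an in-memory store. This is an analogy under stated assumptions, not the WS-RT protocol: real exchanges are SOAP messages, and the class name, method names, identifier scheme, and the use of ElementTree path expressions to select fragments are all hypothetical choices made for the example.

```python
import itertools
import xml.etree.ElementTree as ET


class ResourceStoreSketch:
    """Hypothetical in-memory analogue of get/put/create/delete with fragment access."""

    def __init__(self):
        self._resources = {}
        self._ids = itertools.count(1)

    def create(self, representation: str) -> str:
        """Store a new XML representation and return its identifier."""
        resource_id = f"res-{next(self._ids)}"
        self._resources[resource_id] = ET.fromstring(representation)
        return resource_id

    def get(self, resource_id: str) -> str:
        """Return the whole representation."""
        return ET.tostring(self._resources[resource_id], encoding="unicode")

    def put(self, resource_id: str, representation: str) -> None:
        """Replace the whole representation of an existing resource."""
        if resource_id not in self._resources:
            raise KeyError(resource_id)
        self._resources[resource_id] = ET.fromstring(representation)

    def delete(self, resource_id: str) -> None:
        """Remove the resource."""
        del self._resources[resource_id]

    def get_fragment(self, resource_id: str, path: str) -> str:
        """Read only the part of the representation selected by path."""
        node = self._resources[resource_id].find(path)
        if node is None:
            raise LookupError(path)
        return node.text or ""

    def put_fragment(self, resource_id: str, path: str, value: str) -> None:
        """Update only the part of the representation selected by path."""
        node = self._resources[resource_id].find(path)
        if node is None:
            raise LookupError(path)
        node.text = value
```

With whole-resource operations alone, changing one field means fetching and resending the entire representation; the fragment methods show why an extension for partial access is attractive for large management resources.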
Grid Specifications and Standards
Architecture
The OGF describes OGSA
[40,
41,
42,
43]
as representing an evolution towards a Grid system architecture based on Web services concepts and technologies.
Building on both Grid and Web services technologies, OGSI defines mechanisms for creating, managing, and exchanging
information among entities called Grid services. Succinctly, a Grid service is a Web service that conforms to a set of
conventions (interfaces and behaviors) that define how a client interacts with a Grid service. These conventions, and
other OGSI mechanisms associated with Grid service creation and discovery, provide for the controlled, fault-resilient,
and secure management of the distributed and often long-lived state that is commonly required in advanced distributed
applications,
[44,
45],
and focus on technical details, providing a full specification of the behaviors and WSDL interfaces that define a
Grid service. However, some aspects of OGSI (for example, the density of the specification and its treatment of stateful versus
stateless services) created problems for the convergence of Web services and Grid services, and thus led the community to try again with WS-RF.
The OGSA WS-RF Basic Profile 1.0
[46]
is an OGSA Recommended Profile with the status of Proposed Recommendation, as defined in the OGSA Profile Definition
[47].
The OGSA WS-RF Basic Profile 1.0 describes uses of widely accepted specifications that have been found to enable interoperability.
The specifications considered in this profile are specifically those associated with the addressing, modeling, and management of state: WS-Addressing
[35],
WS-ResourceProperties
[31],
WS-ResourceLifetime
[32],
WS-BaseNotification
[36],
and WS-BaseFaults
[34].
Scheduling
The interaction between the large variety of complex Grid services expected to exist
will require resource management and scheduling solutions that allow the coordinated use of
the services, something that is currently not readily available. Access to resources is typically subject to individual access, accounting,
priority, and security policies that are imposed by the resource owners. In addition, the consideration of different policies is also important
for the implementation of various services, for example accounting or billing services. Generally those policies are enforced by local
management systems. Therefore, an architecture that supports the interaction of independent local management systems with higher-level
scheduling services is an important component for the Grid. Further, a user of a Grid may also establish individual scheduling objectives.
Future Grid scheduling and resource management systems must consider those constraints in the scheduling process.
A scheduling architecture must support the cooperation between different scheduling instances
managing arbitrary Grid resources, including network, software, data, storage, and processing units.
Co-allocation and the reservation of resources will be key aspects of the new scheduling architecture, which will also integrate
user- or provider-defined scheduling policies. The GSA-RG intends to determine the components needed for a generic and modular
scheduling architecture and their interactions. The group has started by creating a dictionary of terms and keywords
[48],
and identifying a set of relevant use cases based on experiences obtained by existing Grid projects
[49].
Resource Management
The Grid, like any computing environment, requires some degree of system management, such as the
management of jobs, security, storage and networks. The management of the Grid is a potentially
complex task given that resources are often heterogeneous, distributed, and span multiple
management domains.
The OGSA Resource Management document
[50]
contains a discussion of the issues of management that are specific to a Grid and especially to OGSA. It first defines the
terms and describes the management requirements as they relate to a Grid, and then discusses the individual interfaces, services,
and activities that are involved in Grid management, including both management within the Grid and the management of its
infrastructure. It concludes with a gap analysis of the state of manageability in OGSA, primarily identifying Grid-specific
management functionality that is not provided for by emerging distributed management standards. The gap analysis is intended
to serve as a foundation for future work.
System Configuration
Successful realization of the Grid vision of a broadly applicable and adopted framework
for distributed systems integration, virtualization, and management requires support
for configuring Grid services, deploying them, and managing their lifecycle
[51].
A major part of this framework is a language used to describe the necessary components and systems.
The Configuration Description, Deployment, and Lifecycle Management document
[52]
provides a definition of the CDDLM language that is based on SmartFrog (Smart Framework for Object Groups) and its
requirements. The CDDLM component model document
[53]
provides a definition of the model and process whereby a Grid resource is configured, instantiated, and destroyed.
The CDDLM API document
[54]
provides the WS-RF-based SOAP API for deploying applications to one or more target computers.
The code that calls the API can upload files to the service implementing the API, then submit a deployment descriptor
for deployment of the application contained in the file.
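The two-step flow described above, in which files are uploaded first and a deployment descriptor is submitted afterwards, can be sketched as follows. Everything in this example is hypothetical: the class, its methods, the descriptor keys, and the "initialized" state are invented for illustration and are not taken from the CDDLM API document, whose real interface is SOAP-based.

```python
from dataclasses import dataclass, field


@dataclass
class DeploymentServiceSketch:
    """Hypothetical stand-in for a deployment endpoint; not the CDDLM API itself."""

    files: dict = field(default_factory=dict)
    deployments: dict = field(default_factory=dict)

    def upload(self, name: str, content: bytes) -> str:
        """Step 1: store a file on the service and return a reference to it."""
        self.files[name] = content
        return f"file:{name}"

    def deploy(self, descriptor: dict) -> str:
        """Step 2: accept a descriptor naming an uploaded file and its target hosts."""
        _, _, name = descriptor["archive"].partition(":")
        if name not in self.files:
            raise ValueError(f"descriptor refers to unknown file {name!r}")
        deployment_id = f"dep-{len(self.deployments) + 1}"
        # Track one lifecycle state per target computer.
        self.deployments[deployment_id] = {
            host: "initialized" for host in descriptor["targets"]
        }
        return deployment_id
```

Separating the upload from the descriptor lets the same uploaded file be referenced by several deployments, which is one plausible reason for the ordering the document describes.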
Data
Three recommendations regarding data access and integration services made by the DAIS-WG (Database Access and Integration Services
Working Group) are currently being considered by the OGF: WS-DAI (core), WS-DAIR (relational data), and WS-DAIX (XML data).
- WS-DAI
[55]
is a specification for a collection of generic data interfaces that can be extended to
support specific kinds of data resources, such as relational databases, XML repositories, object
databases, or files. Related specifications (currently, WS-DAIR and WS-DAIX) define how specific data resources and systems can be
described and manipulated through such extensions. The specifications can be applied in regular
web services environments or as part of a grid fabric.
- WS-DAIR
[56]
is a specification for a collection of data access interfaces for relational
data resources, which extends interfaces defined in the "Web Services Data Access and
Integration" document (WS-DAI). The specification can be applied in regular web services
environments or as part of a grid fabric.
- WS-DAIX
[57]
is a specification for a collection of data access interfaces for XML data
resources, which extends interfaces defined in the Web Services Data Access and Integration
document (WS-DAI). The specification can be applied in regular web services environments or as
part of a grid fabric.
Data Movement
The GridFTP protocol has become a popular data movement tool used to build distributed grid-oriented applications. The GridFTP protocol
extends the FTP protocol by adding certain features designed to improve the performance of data movement over a wide area network,
to allow the application to take advantage of "long fat" communication channels, and to help build distributed data handling applications.
Several groups have developed independent implementations of the GridFTP v1 protocol
[58]
for different types of applications. The experience gained by these groups uncovered several drawbacks of the GridFTP v1 protocol.
Mandrichenko et al.
[59]
propose modifications of the protocol to address the majority of the issues found.
Security
The OGSA Security Roadmap
[60]
defines an authorization service that allows services
to make queries and receive responses regarding access control on grid services. OGSI
authorization services are Grid Services providing authorization functionality over an exposed
Grid Service portType. A client sends a request for an authorization decision to the authorization
service and in return receives an authorization assertion or a decision. A client may be the
resource itself, an agent of the resource, or an initiator or a proxy for an initiator who passes the
assertion on to the resource.
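The request and decision exchange described above can be sketched as a toy policy decision point. The names used here (AuthorizationRequest, Decision, AuthorizationService, decide) and the allow-list policy are assumptions made for illustration; they do not come from the OGSA Security Roadmap or from any OGSI portType.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorizationRequest:
    subject: str   # initiator on whose behalf access is requested
    action: str    # operation attempted on the service
    resource: str  # target grid service


@dataclass(frozen=True)
class Decision:
    request: AuthorizationRequest
    permitted: bool


class AuthorizationService:
    """Toy decision point backed by an allow-list (hypothetical policy model)."""

    def __init__(self, allowed):
        self._allowed = set(allowed)

    def decide(self, request):
        key = (request.subject, request.action, request.resource)
        return Decision(request, key in self._allowed)


service = AuthorizationService({("alice", "submit", "job-service")})
# The resource, or an agent acting for it, asks before performing the action.
granted = service.decide(AuthorizationRequest("alice", "submit", "job-service"))
denied = service.decide(AuthorizationRequest("bob", "submit", "job-service"))
```

In this sketch the client is the resource itself; an initiator could equally obtain the decision first and pass it on to the resource.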
Welch et al.
[61]
define a number of use cases for authorization in OGSI covering the possible set of actions that may be attempted against
a Grid Service, as well as how the different existing
authorization services listed previously may be used. From these use cases they derive a set of
requirements for authorization in OGSI.
Grid Security Infrastructure
The goal of the Grid Security Infrastructure (GSI)
[62,
63]
is to allow secure authentication and communication over an open network. The GSI is based on public key encryption and X.509
certificates, and adheres to the Generic Security Service API (GSS-API)
[24],
which is a standard API for security systems promoted by the Internet Engineering Task Force (IETF). Extensions to these
standards have been added for single sign-on and delegation
[25].
GSI provides:
- A public-key system;
- Mutual authentication through digital certificates;
- Credential delegation and single sign-on through proxy certificates.
Emerging standards and specifications
In this section we briefly detail and discuss what we believe to be the most important of the emerging or more established grid-based standards.
It should be borne in mind that the standards that have been included in this section are closely tied to the dominant grid-based applications
being routinely used today.
An unscientific review of grid applications that have been described in recent research papers and publicized on the Web reveals that
current grid usage is dominated by "high throughput" applications, which are mostly "parameter sweep" or "workflow" applications. The former consists of many
instances of the same application, each with different input data, where the resulting output data is then analyzed. The latter is a
possibly-sophisticated pipeline of processes, "plugged" together to form a chain that can undertake a series of computational tasks
on the original input data set, where a transformed data set is produced. Typically, in both types of applications, some pre-processing is
undertaken to create the parameter sweep or workflow, and then the application is sent off to a software component that schedules and
runs the individual tasks on the back-end grid resources. These applications rely on the ability to both schedule and reserve back-end
resources. A third type of application that is becoming increasingly common is the integration of distributed and heterogeneous databases.
Obviously, each database instance is potentially quite different; each could hold census, medical, geographical, historical, or other records.
Queries across these databases can potentially reveal interesting patterns that provide unique insights. This type of application, where a
user can send off distributed queries to back-end databases, is becoming ever more popular. This application type relies on the
standardization of data access and integration technologies. While there are many other grid applications, we believe that these three
broad types will be dominant for the immediate future, and therefore they will determine the most important emerging standards and specifications.
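The two high-throughput patterns described above can be sketched in a few lines of Python. This is a minimal illustration that runs everything sequentially in one process; on a grid, each run or stage would be handed to a scheduler instead. The function names (parameter_sweep, workflow, simulate) are hypothetical.

```python
from functools import reduce
from itertools import product


def parameter_sweep(task, grid):
    """Run `task` once per combination of the parameter values in `grid`."""
    names = sorted(grid)
    runs = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        runs.append((params, task(**params)))
    return runs


def workflow(stages, data):
    """Feed `data` through `stages` in order, each consuming the previous output."""
    return reduce(lambda current, stage: stage(current), stages, data)


def simulate(x, y):
    # Stand-in for a real application instance.
    return x * y


# Parameter sweep: same application, different inputs, then analyze the outputs.
sweep = parameter_sweep(simulate, {"x": [1, 2], "y": [10, 20]})
best = max(sweep, key=lambda run: run[1])

# Workflow: a chain of processes that transforms the original input data set.
pipeline = [sorted, lambda xs: [v * v for v in xs], sum]
total = workflow(pipeline, [3, 1, 2])
```

The runs of a sweep are independent of one another, which is why they schedule so easily, whereas the stages of a workflow depend on each other's output.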
OGSA
Using the OGSA model, which proposes a Service-Oriented Architecture (SOA), currently seems to be the best way for the Grid
to become more widely accepted. A SOA provides an opportunity for almost any provider to supply user services. Moreover, it should
enable a grid user to bind together a range of diverse services in a workflow that can undertake the tasks needed by their
application. The high-level architectural view inherent in OGSA is conceptually important; however, the actual
implementation details of OGSA are crucial, because no SOA can be globally successful without well-defined standards.
From WS-RF To WS-RT
Many in the grid community believe that stateful services are an important architectural facet, but this has been perhaps the most contentious
and debated area over the last few years. With the adoption of OGSA, two instantiations of this architecture have appeared: first,
the Open Grid Services Infrastructure (OGSI), and more recently, the Web Services Resource Framework (WS-RF). The former was
dropped for numerous reasons, but mainly because it diverged from normal Web Services tooling, and because it contained too
much in one standard. WS-RF materialized soon after, and seemed to be a better solution, but it appears that this too has now been
dropped in favor of Web Services Resource Transfer (WS-RT). It is unclear, at this moment, why this has occurred, but it is possible
that this move may be more politically motivated than technically motivated. Existing grid middleware, such as Globus, will once again be
refactored to use WS-RT, but the effect of yet another change for the community is unclear. One effect of similar changes
in the past has been for the community to either continue to use procedural middleware, such as Globus 2.4, or to resort to using basic
Web Services and standards such as SOAP and WSDL.
Registries
In a SOA, a registry is a vital component if clients and services are going to find each other and bind together. Globus originally
used LDAP, but has now moved to an in-memory XML-based registry that supports XPath and XQuery. gLite, the EGEE middleware,
has R-GMA as its registry. This is based on relational database concepts, is non-standard, and has its own data schema.
The Grid community is also using UDDI-based registries. The UDDI standard has changed a lot over the last few years, and OASIS
is currently working on version 4 of the UDDI standard, while common UDDI implementations are based on version 2 of the standard,
which does not meet the needs of the Grid community for a variety of reasons. The only currently successful use of
UDDI for grid purposes is via efforts such as Grimoires
[64],
which has extended UDDI to suit the needs of the Grid community. Other registry standards that may be applicable are starting
to emerge. One example of this is ebXML
[65],
a registry that is capable of storing any type of electronic content such as XML or text documents, images, sound and video.
The ebXMLsoft Registry and Repository
[66]
supports a number of clients, including web browsers, SOAP, Java, and REST
[67]
(Representational State Transfer).
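To make the idea of an in-memory XML registry queried by path expressions concrete, here is a minimal sketch using Python's standard ElementTree module. It is not modelled on the Globus registry, UDDI, or ebXML: the element names, attribute names, and example.org endpoints are invented, and ElementTree implements only a subset of XPath.

```python
import xml.etree.ElementTree as ET


class Registry:
    """Toy in-memory XML registry (element and attribute names are invented)."""

    def __init__(self):
        self._root = ET.Element("registry")

    def register(self, name, kind, endpoint):
        entry = ET.SubElement(self._root, "service", name=name, kind=kind)
        ET.SubElement(entry, "endpoint").text = endpoint

    def find(self, xpath):
        # ElementTree supports only a subset of XPath, enough for this sketch.
        return [e.findtext("endpoint") for e in self._root.findall(xpath)]


registry = Registry()
registry.register("blast", "compute", "https://example.org/blast")
registry.register("census", "data", "https://example.org/census")
registry.register("render", "compute", "https://example.org/render")

# A client discovers every compute service and can then bind to an endpoint.
compute_endpoints = registry.find("./service[@kind='compute']")
```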
JSDL
There are now many languages for submitting jobs to the Grid; hence interoperability has been difficult if not impossible.
A common language for this purpose is therefore essential. The Job Submission Description Language (JSDL)
[68]
is a declarative language for describing the requirements of job submission. A JSDL document describes the job requirements:
identification information, the application (e.g., executable and arguments), the required resources (e.g., CPUs and memory), and the
input/output files. JSDL does not define a submission interface, what the results of a submission will look like, or how resources
are selected. JSDL 1.0 was published by the OGF as GFD-R-P.56 in November 2005
[69]
and includes a description of JSDL elements and XML Schema. JSDL works with a number of scheduling systems, including
Condor, LSF, Sun's Grid Engine and UNICORE, as well as with UNIX fork.
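The parts of a JSDL document listed above (identification, application, resources, and files) can be illustrated by building a small document with Python's ElementTree. The element names and namespace URIs below follow our reading of JSDL 1.0 and should be checked against GFD.56 before use; the job values, the example.org URI, and the helper names (child, build_job) are invented for this sketch.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as we understand them from JSDL 1.0; verify against GFD.56.
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
ET.register_namespace("jsdl", JSDL)
ET.register_namespace("jsdl-posix", POSIX)


def child(parent, namespace, tag, text=None):
    """Append a namespaced child element and return it."""
    element = ET.SubElement(parent, f"{{{namespace}}}{tag}")
    if text is not None:
        element.text = str(text)
    return element


def build_job(name, executable, arguments, cpus, input_uri, input_file):
    job = ET.Element(f"{{{JSDL}}}JobDefinition")
    description = child(job, JSDL, "JobDescription")
    # Identification information.
    child(child(description, JSDL, "JobIdentification"), JSDL, "JobName", name)
    # The application: executable and arguments.
    posix = child(child(description, JSDL, "Application"), POSIX, "POSIXApplication")
    child(posix, POSIX, "Executable", executable)
    for argument in arguments:
        child(posix, POSIX, "Argument", argument)
    # Required resources.
    cpu = child(child(description, JSDL, "Resources"), JSDL, "TotalCPUCount")
    child(cpu, JSDL, "Exact", cpus)
    # Input file staging.
    staging = child(description, JSDL, "DataStaging")
    child(staging, JSDL, "FileName", input_file)
    child(staging, JSDL, "CreationFlag", "overwrite")
    child(child(staging, JSDL, "Source"), JSDL, "URI", input_uri)
    return job


job = build_job("sweep-001", "/bin/sort", ["-n", "input.txt"], 2,
                "http://example.org/data/input.txt", "input.txt")
document = ET.tostring(job, encoding="unicode")
```

Note that nothing in the document says where or how the job is submitted, which matches the scope of JSDL described above.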
DRMAA
A key component of the grid is a distributed resource management system: software that queues, dispatches, and controls jobs.
The Distributed Resource Management Application API (DRMAA) working group
[70]
has released the DRMAA specification, which offers a standardized API for application integration with C, Java, and Perl bindings.
DRMAA can be used to interact with batch/job management systems, local schedulers, queuing systems, and workload management
systems. DRMAA has been implemented in a number of DRM systems, including Sun's Grid Engine, Condor, PBS/Torque, Gridway, gLite, and UNICORE.
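The three responsibilities named above (queuing, dispatching, and controlling jobs) can be sketched with a toy in-memory resource manager. This is not the DRMAA API: the class and its methods (ToyResourceManager, run_job, control, dispatch, status, result) are hypothetical, and jobs here are plain Python callables rather than programs submitted to a real DRM system.

```python
from collections import deque


class ToyResourceManager:
    """In-memory stand-in for a DRM system: queue, dispatch, control."""

    def __init__(self):
        self._queue = deque()
        self._state = {}
        self._result = {}
        self._next_id = 1

    def run_job(self, function, *args):
        """Queue a job and return its identifier."""
        job_id = self._next_id
        self._next_id += 1
        self._queue.append((job_id, function, args))
        self._state[job_id] = "queued"
        return job_id

    def control(self, job_id, action):
        """Only 'terminate' is modelled: cancel a job that is still queued."""
        if action == "terminate" and self._state.get(job_id) == "queued":
            self._state[job_id] = "terminated"

    def dispatch(self):
        """Run every queued job that has not been terminated."""
        while self._queue:
            job_id, function, args = self._queue.popleft()
            if self._state[job_id] == "queued":
                self._result[job_id] = function(*args)
                self._state[job_id] = "done"

    def status(self, job_id):
        return self._state[job_id]

    def result(self, job_id):
        return self._result.get(job_id)


drm = ToyResourceManager()
first = drm.run_job(pow, 2, 10)
second = drm.run_job(sum, [1, 2, 3])
drm.control(second, "terminate")
drm.dispatch()
```

The value of a standardized API is that application code written against one such interface does not need to change when the DRM system behind it does.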
SAGA
The Simple API for Grid Applications (SAGA)
[71]
has the potential to become an important specification because of the
problems application developers currently face, which revolve around the rapid rate of change in middleware de facto
standards and APIs, the complexity of that middleware, and the fact that different middleware exists on different grid systems. SAGA aims to be to
the grid application developer what MPI has been to the developer of parallel applications. If SAGA is successful, there will be a surge in
development of new grid applications, the rewriting of some current grid applications to have significantly less code, and the emergence of
libraries written on top of SAGA. SAGA started when a number of projects contemplating similar issues came together in 2004, including
GAT, ReG Steering, and CoG. In October 2006, a draft SAGA-API was released, specified in SIDL (Scientific Interface Definition
Language), which is object-oriented and language-neutral. If the promise of SAGA can be delivered, a stable period for application
development will follow, similar to that delivered by MPI in the parallel computing arena over the last 15 to 20 years.
GridFTP
An important feature of a distributed environment is the movement of various types of data between remote components. Data
movement can include data staging, copying an executable to a remote platform, inter-application communications, or copying output
data back to the user of an application. As noted earlier in the section on data movement in Grid Specifications and Standards, the
GridFTP protocol has become popular for moving data in distributed grid-oriented
applications. GridFTP extends FTP, as defined by RFC959 and other IETF documents, by adding features
such as multi-streamed transfer, auto-tuning and grid-based security.
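The idea behind multi-streamed transfer, splitting one payload into byte ranges that are moved concurrently, can be sketched as follows. This is not the GridFTP protocol and involves no network I/O: threads copying slices of an in-memory buffer stand in for parallel data channels, and the function names (split_ranges, transfer) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, streams):
    """Divide `size` bytes into at most `streams` contiguous (start, end) ranges."""
    if size == 0:
        return []
    chunk = -(-size // streams)  # ceiling division
    return [(start, min(start + chunk, size)) for start in range(0, size, chunk)]


def transfer(source, streams):
    """Copy `source` into a new buffer using one worker per byte range."""
    destination = bytearray(len(source))

    def copy_range(bounds):
        start, end = bounds
        # Each worker writes a disjoint slice, so no locking is needed.
        destination[start:end] = source[start:end]
        return end - start

    with ThreadPoolExecutor(max_workers=streams) as pool:
        moved = sum(pool.map(copy_range, split_ranges(len(source), streams)))
    return bytes(destination), moved


payload = bytes(range(256)) * 40  # 10,240 bytes of stand-in data
copy, moved = transfer(payload, streams=4)
```

Because every range carries its own offsets, the pieces can arrive in any order and still be reassembled correctly, which is what allows several streams to share one transfer.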
Workflow
Workflow-based technologies can be found almost everywhere; they are embedded in a range of development tools,
network applications and Web services. There are also many grid-based workflow systems, ranging from those that support SOAs, such as Kepler
[72]
and Taverna
[73],
to those that support application-specific middleware, such as Globus, Condor, GridAnt
[74],
and Pegasus
[75].
Even though workflow standards seem to be everywhere, they have not bridged the gap to broad adoption.
Data Access and Integration
There is a need for middleware to assist with the access to and integration of data from separate sources that are distributed
over the Grid. The OGF Database Access and Integration Services Working Group (DAIS-WG)
[76]
is working toward standards in this area. Two important standards are emerging: OGSA-DAI
[77],
middleware that allows data resources such as relational or XML databases to be accessed via Web Services, and the Distributed
Query Processing (DQP) system, known as OGSA-DQP
[78],
that allows efficient queries across these distributed data resources.
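A query across distributed data resources can be illustrated with two in-memory SQLite databases standing in for separate sites. This sketch does not use OGSA-DAI or OGSA-DQP; the table layout, the sample rows, and the function names (make_resource, distributed_total) are assumptions made for illustration.

```python
import sqlite3


def make_resource(rows):
    """Create an in-memory database standing in for one remote data resource."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE records (region TEXT, cases INTEGER)")
    connection.executemany("INSERT INTO records VALUES (?, ?)", rows)
    return connection


def distributed_total(resources, region):
    """Push the same aggregate query to every resource and combine the partial sums."""
    query = "SELECT COALESCE(SUM(cases), 0) FROM records WHERE region = ?"
    partials = [r.execute(query, (region,)).fetchone()[0] for r in resources]
    return sum(partials)


site_a = make_resource([("north", 12), ("south", 7)])
site_b = make_resource([("north", 5), ("east", 9)])
north_total = distributed_total([site_a, site_b], "north")
```

Sending the aggregate to each site and combining only the partial results, rather than shipping all rows to the client, is one simple reason such queries can be efficient.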
Summary and conclusions
Standards of all types are crucial if the vision of the Grid is to be fully realized. There are a large number of both standards bodies and
standards that impact and define today's Grid. Some standards are built on one another, and some standards oppose each other.
(The recent roadmap on WS
[79]
from HP, IBM, Intel, and Microsoft may be a sign that there will be fewer competing specifications in the future.) There are a number
of generalizations that can be made about standards processes, and almost all of them, both positive and negative, apply to
the standards on which the Grid is based. Some of the problems in the current Grid standards are:
- Vested interest and potential intransigence on the part of some major players who are defining standards,
- Lack of involvement from other key players,
- Changing road maps of related standards,
- General politics.
The effect of all these problems is the delay of the overall standards process, which in turn distresses developers who then have to
make design choices based on those standards that are currently available. This causes developers to use multiple alternatives, which
reduces the acceptance of the later-released standards. There are many precedents where well-developed standards have not
been taken up, at least partially due to their late emergence, such as OSI
[80]
and HPF
[81].
Bibliography
[1] J. C. R. Licklider, "Man-Computer Symbiosis," IRE Trans. on Human Factors in Electronics, v. HFE-1, pp. 4-11, Mar. 1960
[2] http://tools.ietf.org/html/rfc707
[3] M. J. Litzkow, "Remote UNIX - Turning Idle Workstations into Cycle Servers," Proc. of USENIX, pp. 381-384, Sum. 1987
[4] M. Litzkow, M. Livny, M. Mutka, "Condor - A Hunter of Idle Workstations," Proc. of 8th Int. Conf. of Dist. Comp. Sys., pp. 104-111, Jun. 1988
[5] V. S. Sunderam, "PVM: A Framework for Parallel Distributed Computing," Concurrency: Prac. and Exp., v. 2(4), pp. 315-339, Dec. 1990
[6] Gigabit Testbed Initiative Final Report, 1996. (http://www.cnri.reston.va.us/gigafr/)
[7] I. Foster, J. Geisler, W. Nickless, W. Smith, S. Tuecke, "Software Infrastructure for the I-WAY High Performance Distributed Computing Experiment," Proc. 5th IEEE Symposium on High Performance Distributed Computing, pp. 562-571, 1997.
[8] The Globus Project (http://www.globus.org/)
[9] The Globus Alliance (http://www.globus.org/alliance/)
[10] I. Foster, C. Kesselman, S. Tuecke, "The Anatomy of the Grid: Enabling Scalable Virtual Organizations," Lecture Notes in Computer Science, v. 2150, 2001.
[11] I. Foster, C. Kesselman, J. Nick, S. Tuecke, "The Physiology of the Grid: an Open Grid Services Architecture for Distributed Systems Integration," 2002.
[12] Legion (http://legion.virginia.edu/)
[13] A. S. Grimshaw, W. A. Wulf, "The Legion Vision of a Worldwide Virtual Computer," Comm. of the ACM, v. 40(1), January 1997.
[14] The Mentat project (http://www.cs.virginia.edu/~mentat/)
[15] Sybase Avaki EII (http://www.sybase.com/products/developmentintegration/avakieii)
[16] UNICORE (http://www.unicore.org/)
[17] Grid Engine open source project website (http://www.sun.com/software/gridware/)
[18] UNICORE Plus (http://www.fz-juelich.de/unicoreplus/)
[19] GRIP (http://www.fz-juelich.de/zam/cooperations/grip)
[20] UniGrids (http://www.unigrids.org/)
[21] Enterprise Grid Alliance, "EGA Reference Model and Use Cases v1.5" (http://www.gridalliance.org/en/WorkGroups/ReferenceModel.asp)
[22] Enterprise Grid Alliance, "EGA Grid Security Requirements v1.0" (http://www.gridalliance.org/en/WorkGroups/GridSecurity.asp)
[23] Enterprise Grid Alliance, "Enterprise Data and Storage Provisioning Problem Statement and Approach" (http://www.gridalliance.org/en/WorkGroups/DataandStorageProvisioningRequirements.asp)
[24] Jeffrey Hutzelman, Joseph Salowey, Joseph Galbraith, and Von Welch, "RFC4462: Generic Security Service Application Program Interface (GSS-API) Authentication and Key Exchange for the Secure Shell (SSH) Protocol," In RFC4462, Internet Engineering Task Force, 2006.
[25] S. Tuecke, V. Welch, D. Engert, L. Pearlman, M. Thompson, "RFC3820: Internet X.509 Public Key Infrastructure (PKI) Proxy Certificate Profile," In RFC3820, Internet Engineering Task Force, 2004.
[26] Web Services (http://www.w3.org/2002/ws/)
[27] SOAP version 1.2 (http://www.w3.org/TR/2002/WD-soap12-part0-20020626/)
[28] WSDL (http://www.w3.org/TR/wsdl)
[29] UDDI (http://uddi.org/pubs/uddi_v3.htm#_Toc85907967)
[30] WS-RF Primer (http://docs.oasis-open.org/wsrf/wsrf-primer-1.2-primer-cd-02.pdf)
[31] WS-ResourceProperties (WS-RP) (http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ResourceProperties-1.2-draft-06.pdf)
[32] WS-ResourceLifetime (WS-RL) (http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ResourceLifetime-1.2-draft-03.pdf)
[33] WS-ServiceGroup (WS-SG) (http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ServiceGroup-1.2-draft-02.pdf)
[34] WS-BaseFaults (WS-BF) (http://docs.oasis-open.org/wsrf/wsrf-ws_base_faults-1.2-spec-pr-01.pdf)
[35] WS-Addressing (http://www.w3.org/Submission/ws-addressing/)
[36] WS-BaseNotification, March 2004 (ftp://www6.software.ibm.com/software/developer/library/ws-notification/WS-BaseN.pdf)
[37] WS-BrokeredNotification, March 2004 (ftp://www6.software.ibm.com/software/developer/library/ws-notification/WS-BrokeredN.pdf)
[38] WS-Topics, March 2004 (ftp://www6.software.ibm.com/software/developer/library/ws-notification/WS-Topics.pdf)
[39] Web Services Resource Transfer (WS-RT) (http://devresource.hp.com/drc/specifications/wsrt/WS-ResourceTransfer-v1.pdf)
[40] I. Foster, C. Kesselman, J. M. Nick, S. Tuecke, "The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration" (http://www.globus.org/alliance/publications/papers/ogsa.pdf)
[41] I. Foster, H. Kishimoto, A. Savva, D. Berry, A. Djaoui, A. Grimshaw, B. Horn, F. Maciel, F. Siebenlist, R. Subramaniam, J. Treadwell, J. Von Reich, "The Open Grid Services Architecture, Version 1.0" (http://www.gridforum.org/documents/GWD-I-E/GFD-I.030.pdf)
[42] I. Foster, D. Gannon, H. Kishimoto, J. J. Von Reich, "Open Grid Services Architecture Use Cases" (http://www.gridforum.org/documents/GWD-I-E/GFD-I.029v2.pdf)
[43] H. Kishimoto, J. Treadwell, "Defining the Grid: A Roadmap for OGSA™ Standards: Version 1.0" (http://www.ogf.org/documents/GFD.53.pdf)
[44] S. Tuecke, K. Czajkowski, I. Foster, J. Frey, S. Graham, C. Kesselman, T. Maguire, T. Sandholm, D. Snelling, P. Vanderbilt, "Open Grid Services Infrastructure (OGSI): Version 1.0" (http://www.ogf.org/documents/GFD.15.pdf)
[45] Open Grid Service Infrastructure Primer (http://tinyurl.com/yss7tp)
[46] I. Foster, T. Maguire, D. Snelling, "OGSA WS-RF Basic Profile 1.0" (http://www.ogf.org/documents/GFD.72.pdf)
[47] T. Maguire, D. Snelling, "OGSA Profile Definition Version 1.0" (http://www.ogf.org/documents/GFD.59.pdf)
[48] M. Roehrig, M. Ziegler, "Grid Scheduling Dictionary of Terms and Keywords" (http://www.ogf.org/documents/GFD.11.pdf)
[49] R. Yahyapour, P. Wieder, "Grid Scheduling Use Cases" (http://www.ogf.org/documents/GFD.64.pdf)
[50] F. B. Maciel, "Resource Management in OGSA" (http://www.ogf.org/documents/GFD.45.pdf)
[51] D. Bell, T. Kojo, P. Goldsack, S. Loughran, D. Milojicic, S. Schaefer, J. Tatemura, P. Toft, "Configuration Description, Deployment, and Lifecycle Management (CDDLM) Foundation Document" (http://www.ogf.org/documents/GFD.50.pdf)
[52] P. Goldsack, "Configuration Description, Deployment, and Lifecycle Management: SmartFrog-Based Language Specification" (http://www.ogf.org/documents/GFD.51.pdf)
[53] S. Schaefer, "Configuration Description, Deployment, and Lifecycle Management: Component Model: Version 1.0" (http://www.ogf.org/documents/GFD.65.pdf)
[54] S. Loughran, "Configuration Description, Deployment, and Lifecycle Management: CDDLM Deployment API" (http://www.ogf.org/documents/GFD.69.pdf)
[55] M. Antonioletti, M. Atkinson, A. Krause, S. Laws, S. Malaika, N. W. Paton, D. Pearson, G. Riccardi, "Web Services Data Access and Integration - The Core (WS-DAI) Specification, Version 1.0" (http://www.ogf.org/documents/GFD.74.pdf)
[56] M. Antonioletti, B. Collins, A. Krause, S. Laws, J. Magowan, S. Malaika, N. W. Paton, "Web Services Data Access and Integration - The Relational Realisation (WS-DAIR) Specification, Version 1.0" (http://www.ogf.org/documents/GFD.76.pdf)
[57] M. Antonioletti, S. Hastings, A. Krause, S. Langella, S. Lynden, S. Laws, S. Malaika, N. W. Paton, "Web Services Data Access and Integration - The XML Realization (WS-DAIX) Specification, Version 1.0" (http://www.ogf.org/documents/GFD.75.pdf)
[58] W. Allcock, J. Bester, J. Bresnahan, S. Meder, P. Plaszczak, S. Tuecke, "GridFTP: Protocol Extensions to FTP for the Grid" (http://www.ogf.org/documents/GFD.20.pdf)
[59] I. Mandrichenko, W. Allcock, T. Perelmutov, "GridFTP v2 Protocol Description" (http://www.ogf.org/documents/GFD.47.pdf)
[60] F. Siebenlist, V. Welch, S. Tuecke, I. Foster, N. Nagaratnam, P. Janson, J. Dayka, A. Nadalin, "OGSA Security Roadmap (Draft)" (http://www.cs.virginia.edu/~humphrey/ogsa-sec-wg/ogsa-sec-roadmap-v13.pdf)
[61] V. Welch, F. Siebenlist, D. Chadwick, S. Meder, L. Pearlman, "OGSA Authorization Requirement" (http://www.ogf.org/documents/GFD.67.pdf)
[62] GSI Working Group (https://forge.gridforum.org/projects/gsi-wg)
[63] I. Foster, C. Kesselman, G. Tsudik, S. Tuecke, "A Security Architecture for Computational Grids," Proc. 5th ACM Conference on Computer and Communications Security Conference, pp. 83-92, 1998.
[64] Grimoires (http://www.ecs.soton.ac.uk/research/projects/grimoires)
[65] ebXML Registry Services Specification v2.5 (http://www.oasis-open.org/committees/regrep/documents/2.5/specs/ebrs-2.5.pdf)
[66] ebXMLsoft Registry and Repository (http://www.ebxmlsoft.com/)
[67] REST (http://en.wikipedia.org/wiki/REST)
[68] JSDL (https://forge.gridforum.org/projects/jsdl-wg/)
[69] JSDL-doc (http://www.gridforum.org/documents/GFD.56.pdf)
[70] DRMAA (http://www.drmaa.org)
[71] SAGA (http://www.ogf.org/gf/group_info/view.php?group=saga-rg)
[72] I. Altintas, C. Berkley, E. Jaeger, M. Jones, B. Ludaescher, S. Mock, "Kepler: An Extensible System for Design and Execution of Scientific Workflows," Proc. of 16th Int. Conf. on Sci. and Statistical Database Management (SSDBM'04), pp. 423-424, 2004
[73] T. Oinn, M. Addis, J. Ferris, D. Marvin, M. Senger, M. Greenwood, T. Carver, K. Glover, M. R. Pocock, A. Wipat, P. Li, "Taverna: A Tool for the Composition and Enactment of Bioinformatics Workflows," Bioinformatics J., v. 20(17), pp. 3045-3054, 2004
[74] K. Amin, G. vonLaszewski, "GridAnt: A Grid Workflow System," Argonne National Laboratory, Feb 2003
[75] E. Deelman, J. Blythe, Y. Gil, C. Kesselman, G. Mehta, S. Patil, M. Su, K. Vahi, M. Livny, "Pegasus: Mapping Scientific Workflows onto the Grid," Across Grids Conference 2004
[76] DAIS-WG (https://forge.gridforum.org/projects/dais-wg)
[77] OGSA-DAI (http://www.ogsadai.org.uk)
[78] OGSA-DQP (http://www.ogsadai.org.uk/about/ogsa-dqp/)
[79] K. Cline, J. Cohen, D. Davis, D. F. Ferguson, H. Kreger, R. McCollum, B. Murray, I. Robinson, J. Schlimmer, J. Shewchuk, V. Tewari, W. Vambenepe, "Toward Converging Web Service Standards for Resources, Events, and Management" (http://download.boulder.ibm.com/ibmdl/pub/software/dw/webservices/Harmonization_Roadmap.pdf)
[80] ISO standard 7498-1, 1994 (http://standards.iso.org/ittf/PubliclyAvailableStandards/s020269_ISO_IEC_7498-1_1994(E).zip)
[81] High Performance Fortran standards (http://hpff.rice.edu/versions/)
[82] Globus Toolkit 3 Programmer's Tutorial, Key Concepts: WSRF & GT4 (http://gdp.globus.org/gt3-tutorial/multiplehtml/ch01s05.html)
[83] Globus Toolkit 3 Programmer's Tutorial, Key Concepts: OGSA, WSRF, and GT4 (http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s01.html)
[84] Web Services Standards as of Q1 2007 (http://www.innoq.com/resources/ws-standards-poster/)