 Typical Usage Examples

  
The experience of using a grid can vary considerably from one grid to another, depending on the user interface (e.g., command line, web portal, or access through an application), the grid middleware (e.g., the grid product used, Web services vs. pre-Web services), the available applications, and the connected resources. As grid usage increases and diversifies, commonalities in the user experience are likely to emerge, much as basic personal computer skills now transfer from one platform to another. The aspects most likely to converge can be previewed today in large-scale, multi-purpose grids, which strive to develop portals that provide customized "MyPortal" views while efficiently reusing the underlying functional components. The following examples from SURAgrid offer only a glimpse of this range; each has a different purpose and is intended for a specific user community. In future versions of the Cookbook, we hope to expand this section with more examples that show the variety of possible applications and the diversity of application environments.


Job Submission on SURAgrid: Multiple Genome Alignment

In this example we show the steps a researcher follows to submit a job through the SURAgrid portal for the Multiple Genome Alignment application at Georgia State University (GSU). This application takes multiple genome sequences as input and produces an aligned sequence based on structure. It uses a memory-efficient pairwise alignment algorithm and parallelized code that can run on a grid.

As you will notice in this example, several levels of authentication are required to reach the grid, file storage, and compute resources. Our first step is to authenticate with (log into) SURAgrid. This is done by using a typical login window.


Figure EX-1. Login window.

Our next step is to get a proxy from the MyProxy proxy server for the machine within the grid that holds our source data. Here we specify the grid resource (banderas.tacc), the port to use (7512), who we are (a e neuman [1]), and how long we want this access to last (2 hours).


Figure EX-2. Proxy request window.

Upon clicking the "Get Proxy from MyProxy" button, the portal returns information about our proxy.


Figure EX-3. Proxy response window.
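Outside the portal, a comparable request can be made with the MyProxy command-line client. The following is a minimal sketch and not the portal's own implementation: the host name is copied as it appears in the text (a site would normally use a fully qualified name), and the account name `aeneuman` is a hypothetical stand-in.

```shell
#!/bin/sh
# Values taken from the portal form described above; the account name is hypothetical.
MYPROXY_HOST="banderas.tacc"   # as written in the text; probably abbreviated
MYPROXY_PORT=7512
MYPROXY_USER="aeneuman"        # hypothetical user name
LIFETIME_HOURS=2

if command -v myproxy-logon >/dev/null 2>&1; then
    # Prompts for the MyProxy pass phrase, then stores a short-lived proxy locally.
    myproxy-logon -s "$MYPROXY_HOST" -p "$MYPROXY_PORT" \
                  -l "$MYPROXY_USER" -t "$LIFETIME_HOURS"
else
    echo "myproxy-logon not installed; would run:"
    echo "myproxy-logon -s $MYPROXY_HOST -p $MYPROXY_PORT -l $MYPROXY_USER -t $LIFETIME_HOURS"
fi
```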

Note the grid options in the tabs across the top. We can progress toward job submission via each of these tabs. Next we move our source data files to the grid resource on which the computation is to be performed. Under the File Management tab, we find a secure copy tool. First we specify the source machine (banderas), the destination machine (mileva), and the relevant directories. SURAgrid responds with menus of the directory contents.


Figure EX-4. Directory contents window.

At this point we select which files to move over to mileva. Now we're ready to submit the job via the Job Submission tab.


Figure EX-5. Job submission window.

When we submit the job, a "job handle" (underlined in the figure below) is returned to us so that we can track the job via the Job Status window under the Job Submission tab.


Figure EX-6. Job status window.

From this vantage point we can track the job via the Status column, compare it with progress of other jobs on the grid, Cancel the job, or Delete it completely.

When the job completes (its status is DONE), we can retrieve our output files, again via the File Management tab.


Figure EX-7. Directory contents window.

We can now look at our output file.


Figure EX-8. Job output window.

For more information on using or joining SURAgrid, see About SURAgrid [2].

 


SCOOP (SURA Coastal Ocean Observing & Prediction) Demonstration Portal

The following example is a prototype of a distributed national laboratory for coastal research and operations, being developed by the SCOOP program (scoop.sura.org). SCOOP as a whole focuses on numerical modeling, real-time data exchange, and 24/7 operational prediction and visualization for storm surge, wind, waves, and surface currents, with special attention to predicting and visualizing phenomena that cause damage and inundation of coastal regions during severe storms and hurricanes.

In this section we show some screenshots of the SCOOP GridSphere demonstration portal as of Fall 2006. The portal is web accessible, and we start with a login page.


Figure EX-10. Login window.

Once we log in, we're presented with a number of tools across the menu bar. Under the Start Test tab we are presented with various models and tests that we can run on the grid. Here we get ready to rerun the demo that was constructed for the SC06 conference.


Figure EX-11. Running a demo.

From here we move to the Resource Monitoring page and check to see which resources are available for our use. We see that 5 machines are in the grid and up to 124 CPUs are available. Network bandwidth information is also available.


Figure EX-12. Resource information.

We can also get a graphical image of this information using the Resource Portlet. Note the color- and shape-coded status information available in this view.


Figure EX-13. SCOOP resource portlet.

We can monitor our job's progress and compare that progress with other activities on the grid through the Job Monitoring tab. We can quickly see status via the color-coded Status ball. The start time and CPU usage of other jobs can also help us estimate when our job is likely to complete.


Figure EX-14. Job status window.

We can monitor the flow of various data types within the models and across the computer systems within the grid.


Figure EX-15. Data transport information.

Finally, we can use the portal's visualization tools to help us understand the results.


Figure EX-16. Visualization of results.

For more information, see the SURA Coastal Ocean Observing and Prediction (SCOOP) Program [9] web page.


Job Submission: Bio-electric Simulator for Whole Body Tissue

The following application from Old Dominion University (ODU), run on SURAgrid, simulates the response of a "whole body tissue" model to a potential or current stimulus applied through direct electrode contact. It is accessed through a command line interface. While web APIs are developing rapidly (they require more work but offer the potential for much improved job management), some people still prefer command line interfaces. Those who are familiar with using command line options to submit jobs to a cluster will notice that the differences are not significant. Command line access is also often useful in the development and initial debugging of grid applications. The ODU project team sees the current version of BioSim as the first stage of running on a grid. It has already enabled them to make significant headway on their initial goal of testing the scalability of BioSim on much larger compute clusters than would be available locally. In the next stage, the research team will modify BioSim to use the SURAgrid portal for data and job management, eventually adding automatic selection of the best available resources and dynamic job control.

The first step is to log in to a local resource and generate proxy credentials. This is done with a call to grid-proxy-init. Further information is provided by the grid-proxy-info command.


Figure EX-17. Local login window (using the PuTTY terminal application).


Figure EX-18. Proxy credential creation.
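The two commands can be sketched as follows. The explicit `-valid` lifetime is an assumption made here to show where the lifetime is set; the session in the figure may simply have relied on the tool's default.

```shell
#!/bin/sh
PROXY_LIFETIME="12:00"   # hours:minutes; explicit value is an assumption

if command -v grid-proxy-init >/dev/null 2>&1; then
    grid-proxy-init -valid "$PROXY_LIFETIME"   # prompts for the private-key pass phrase
    grid-proxy-info                            # prints subject, type, and time left
else
    echo "Globus tools not installed; would run: grid-proxy-init -valid $PROXY_LIFETIME"
fi
```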

The system returns with a proxy that is valid for 12 hours. A call to gsissh completes the connection. (GSI-OpenSSH is a modified version of OpenSSH that adds support for GSI authentication and credential forwarding [delegation], providing a single sign-on remote login and file transfer service.)


Figure EX-19. GSI-OpenSSH login window.
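With a valid proxy in place, the remote login can be sketched as below. The fully qualified host name is an assumption used for illustration; no password is requested because GSI authenticates with the proxy credential.

```shell
#!/bin/sh
REMOTE_HOST="mileva.hpc.odu.edu"   # assumed login host; substitute your own
CMD="gsissh $REMOTE_HOST"

if command -v gsissh >/dev/null 2>&1; then
    # Authenticates with the proxy created by grid-proxy-init.
    $CMD
else
    echo "gsissh not installed; would run: $CMD"
fi
```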

A Grid Information Service (GIS, not to be confused with GSI!) provides resource discovery and monitoring services for a grid. The shell command grid-info-search queries a GIS server using filters written in LDAP search filter syntax, and returns information about the compute resources that match the criteria given on the command line.


Figure EX-20. Compute resource information via grid-info-search.
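A query of this kind can be sketched as below. The options follow the LDAP-style conventions of the Globus MDS client (`-x` for an anonymous bind, `-h` host, `-p` port, `-b` base DN). The host name, the port, the base DN, and the catch-all filter are assumptions chosen for illustration, not values read from the figure.

```shell
#!/bin/sh
MDS_HOST="mileva.hpc.odu.edu"        # assumed GIS/MDS server
MDS_PORT=2135                        # conventional MDS port (assumption here)
BASE_DN="mds-vo-name=local,o=grid"   # commonly used base DN (assumption here)
FILTER="(objectclass=*)"             # LDAP filter that matches every entry

if command -v grid-info-search >/dev/null 2>&1; then
    grid-info-search -x -h "$MDS_HOST" -p "$MDS_PORT" -b "$BASE_DN" "$FILTER"
else
    echo "grid-info-search not installed; would run:"
    echo "grid-info-search -x -h $MDS_HOST -p $MDS_PORT -b \"$BASE_DN\" \"$FILTER\""
fi
```

Narrowing the filter (for example to a single object class or attribute) reduces the output to the job manager, queue, or node entries of interest.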

A fairly broad search returns both job manager information and queue information.


Figure EX-21. Job manager information.


Figure EX-22. Queue information.

The Globus Monitoring and Discovery Service (MDS) shows us a PBS job manager on host mileva.hpc.odu.edu. The queue, named "batch", includes 4 nodes, has no maximum CPU time, and so forth.

We can take a closer look at the nodes, including their state and some fairly specific hardware information.


Figure EX-23. Compute node detail.

Now it's time to submit a job. This grid uses the Portable Batch System (PBS) as its scheduler, so a PBS script is prepared to describe the job. The script is in the file simple.pbs.


Figure EX-24. PBS job submission script.

The job is named simplePbsTest and will use 2 processors on each of 4 nodes. It is allowed up to 15 minutes of run time. Standard output will go to simplePbsTest.o<JobID> and standard error will go to simplePbsTest.e<JobID>. The executable is in the current directory and is named HelloWorld.exe. Based on the use of mpiexec, we know this application has been developed using the MPI library, which handles the program and data distribution and the communication across the nodes.
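Because the figure is not reproduced here, the following is a reconstruction from the description above rather than the project's actual file. The directive values come from the text; the shebang line, the `cd` into the submission directory, and the explicit process count passed to `mpiexec` are assumptions.

```shell
#!/bin/sh
# Write a PBS script that matches the description; the contents are a reconstruction.
cat > simple.pbs <<'EOF'
#!/bin/sh
#PBS -N simplePbsTest
#PBS -l nodes=4:ppn=2
#PBS -l walltime=00:15:00

# With no -o/-e directives, PBS names the output files
# simplePbsTest.o<JobID> and simplePbsTest.e<JobID>.

# PBS starts the job in the home directory; return to where qsub was run.
cd "$PBS_O_WORKDIR"

# 4 nodes x 2 processors per node = 8 MPI processes (count is an assumption).
mpiexec -n 8 ./HelloWorld.exe
EOF
echo "Wrote simple.pbs"
```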

The PBS qsub command submits the job to the batch queue.


Figure EX-25. Job submission information.
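`qsub` prints the identifier of the new job on standard output, so a script can capture it for later tracking. In this sketch the identifier format (a sequence number followed by a server name) is an assumption about this site, and the sample value stands in when `qsub` is not installed.

```shell
#!/bin/sh
if command -v qsub >/dev/null 2>&1 && [ -f simple.pbs ]; then
    FULL_ID=$(qsub simple.pbs)
else
    FULL_ID="5235.mileva.hpc.odu.edu"   # sample value; server suffix is assumed
fi

# Keep only the part before the first dot: the numeric job ID.
JOB_ID=${FULL_ID%%.*}
echo "Submitted job $JOB_ID"
```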

The Job ID "5235" can be used to track the job. For example, the PBS qstat command will show us the status of the job as it progresses through the machine and/or grid.


Figure EX-26. Job status information.

At this point the job has not yet started but is queued to run.
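The state column in `qstat` output uses single-letter codes. A small helper such as the hypothetical `describe_state` below can translate the common ones. The fallback value `Q` simply mirrors the queued state described above and is used only when `qstat` is not installed.

```shell
#!/bin/sh
# Translate a PBS job-state letter into words (hypothetical helper).
describe_state() {
    case "$1" in
        Q) echo "queued" ;;
        R) echo "running" ;;
        E) echo "exiting" ;;
        H) echo "held" ;;
        C) echo "completed" ;;   # reported by TORQUE-style PBS servers
        *) echo "unknown" ;;
    esac
}

JOB_ID=5235
if command -v qstat >/dev/null 2>&1; then
    # In the default listing, two header lines precede the job row,
    # and the state letter is the fifth column.
    STATE=$(qstat "$JOB_ID" | awk 'NR > 2 { print $5 }')
else
    STATE="Q"   # stand-in: job still queued
fi
echo "Job $JOB_ID is $(describe_state "$STATE")"
```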

Once the job runs, we can look at the output file, simplePbsTest.o5235.


Figure EX-27. Job output file.

(Aha! Yet another use of the classic "Hello World" example!) At this point the output file is still on the grid's file (GridFTP) server. The globus-url-copy command transfers files to, from, or between GridFTP servers. In this example, the output file has been transferred from mileva to the file BioSimTestRun1.txt in the directory /tmp on the local workstation.


Figure EX-28. Transferring files back home.
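A transfer of this kind takes a source URL and a destination URL. In the sketch below the remote directory `/home/username` is a hypothetical placeholder and the fully qualified host name is an assumption; the file names come from the text.

```shell
#!/bin/sh
REMOTE_HOST="mileva.hpc.odu.edu"   # assumed GridFTP server name
REMOTE_DIR="/home/username"        # hypothetical remote directory
SRC_URL="gsiftp://${REMOTE_HOST}${REMOTE_DIR}/simplePbsTest.o5235"
DST_URL="file:///tmp/BioSimTestRun1.txt"

if command -v globus-url-copy >/dev/null 2>&1; then
    # Requires a valid proxy; copies the remote file into the local /tmp directory.
    globus-url-copy "$SRC_URL" "$DST_URL"
else
    echo "globus-url-copy not installed; would run: globus-url-copy $SRC_URL $DST_URL"
fi
```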

Lastly, what if we realize a job we submitted includes errors and we want to delete it? The PBS qdel command will take care of that.


Figure EX-29. Deleting a job.

A follow-up qstat shows that there are no more jobs running or queued for our user.
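Deletion and the check that follows it can be sketched together. The job ID shown is illustrative; in practice it is the ID that qsub returned for the job to be removed.

```shell
#!/bin/sh
JOB_ID=5235          # illustrative; use the ID that qsub returned
ME=$(id -un)         # current login name

if command -v qdel >/dev/null 2>&1; then
    qdel "$JOB_ID"     # remove the job from the queue, or stop it if running
    qstat -u "$ME"     # list any jobs we still have; ideally none
else
    echo "PBS tools not installed; would run: qdel $JOB_ID, then qstat -u $ME"
fi
```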


Bibliography

[1] Alfred E. Neuman (http://www.answers.com/topic/alfred-e-neuman)
[2] About SURAgrid (http://www.sura.org/programs/sura_grid.html)
[3] SCOOP institutions (http://violet.itsc.uah.edu:8080/gridsphere/gridsphere?cid=partners)
[4] Office of Naval Research (http://www.onr.navy.mil/)
[5] NOAA's Coastal Services Center (http://www.csc.noaa.gov/)
[6] U.S. Ocean Action Plan (http://ocean.ceq.gov/actionplan.pdf)
[7] Global Earth Observation System of Systems (http://www.epa.gov/geoss/)
[8] Integrated Earth Observing System (http://www.noaa.gov/lautenbacher/oceanology.htm)
[9] SURA Coastal Ocean Observing and Prediction (SCOOP) Program (http://scoop.sura.org/)

© 2006-8, Southeastern Universities Research Association
Sponsored by SURA, TATRC (No. W81XWH-06-1-0419), OSG, and iVDGL
Updated September, 2007