Compiling MPI Jobs

ALERT! WARNING!
The contents of this TWiki page are outdated and are kept for historical purposes. Exercise caution before starting a new project based on this information; consider whether High-Throughput Parallel Computing might work for you instead.

Picking A Site

The first step is to pick an OSG site that supports your VO and has an MPI implementation installed. The easiest way to do this is to decide on an MPI implementation (MPICH, MPICH2, OpenMPI, etc.) and perform an LDAP query against the information service. For example, to see all of the MPICH versions installed on Purdue's Steele cluster, the following command can be used:

tg-steele$ ldapsearch -x -l 60 -b mds-vo-name=Purdue-Steele,mds-vo-name=local,o=grid -h is.grid.iu.edu -p 2170 "GlueSoftwareName=MPICH" GlueSoftwareEnvironmentSetup

The "GlueSoftwareEnvironmentSetup" field shows the command that loads that MPI version into your environment. You will need this command both to compile and to run the application.
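The query returns a full LDIF record for each match, so it can help to filter the output down to just the setup commands. The following is a minimal sketch, assuming the attribute appears on a single unwrapped line of the form "GlueSoftwareEnvironmentSetup: <command>". The function name extract_setup_commands is hypothetical and not part of any OSG tool.

```shell
#!/bin/bash
# Hypothetical helper: read ldapsearch (LDIF) output on stdin and print
# each distinct GlueSoftwareEnvironmentSetup value once.
# Assumption: values are plain text on one line (not wrapped or base64).
extract_setup_commands() {
    sed -n 's/^GlueSoftwareEnvironmentSetup: //p' | sort -u
}

# Usage (run the ldapsearch command shown above and pipe it in):
#   ldapsearch -x ... "GlueSoftwareName=MPICH" GlueSoftwareEnvironmentSetup \
#       | extract_setup_commands
```

Each line printed is a candidate command to place in your compile script.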

Compiling MPI Jobs

Compiling the application can be done in one of two ways. If the site allows local logins, you can log in directly to the server, run the command to load MPI into your environment, and compile your application. This is essentially the same workflow you would use to build the application on your own machine.

Since most sites don't allow local logins, however, the other way is to submit a batch job that compiles the application for you. This can be accomplished by writing a short script that loads the MPI module and then compiles the application. The following is an example script that will compile the cpi program.

A Sample Compile Script

#!/bin/bash
#
# Right now on Purdue's Steele cluster, the modules program is not in the user's
# path when a job is run. In order to ensure the module command works, we need
# to source the module setup script. For sites using softenv, sourcing
# /etc/profile.d/softenv.sh should work instead.

source /etc/profile.d/modules.sh
# This is where the command from GlueSoftwareEnvironmentSetup goes
module load mpich-gcc

mpicc -o cpi cpi.c
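
If the module fails to load on the worker node, the script above only fails when mpicc is invoked, and the cause can be hard to spot in the job's error file. The following is a sketch of a more defensive variant, assuming the same module setup script and module name as above. The function names require_command and compile_cpi are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: report clearly when a required command is missing.
require_command() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "error: '$1' not found in PATH" >&2
        return 1
    fi
}

# Same steps as the sample compile script, with an explicit check
# that mpicc is available after the module has been loaded.
compile_cpi() {
    . /etc/profile.d/modules.sh    # site-specific, as in the script above
    module load mpich-gcc          # command from GlueSoftwareEnvironmentSetup
    require_command mpicc || return 1
    mpicc -o cpi cpi.c
}
```

To use it as a compile script, call compile_cpi as the last line of the file.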

A Sample Submit Script

Once you have a script that will compile your application, you can submit it as a Condor-G job. For example, the following submit script runs the above compile script on Purdue's Steele cluster and transfers the resulting executable back to you:

Universe = grid
Grid_Resource = gt2 lepton.rcac.purdue.edu/jobmanager-pbs

Executable = compile.sh

Output = compile_job.out
Log    = compile_job.log
Error  = compile_job.error
WhenToTransferOutput = ON_EXIT
Transfer_Input_Files = cpi.c,compile.sh
Transfer_Output_Files = cpi

Queue 

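Once the Condor-G job ends, you'll most likely have to make the returned program (cpi in the example above) executable, since the transferred file may not keep its execute permission. You can do this by running: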

chmod +x 
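
Before setting the permission, it can be worth confirming that the compile job actually returned a file. The following is a minimal sketch, assuming the output file is named cpi as in the submit script above. The function name ensure_executable is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: verify that the returned file exists and is
# non-empty, then add the execute bit if it is not already set.
ensure_executable() {
    file=$1
    if [ ! -s "$file" ]; then
        echo "error: '$file' is missing or empty; check compile_job.error" >&2
        return 1
    fi
    [ -x "$file" ] || chmod +x "$file"
}

# Usage:
#   ensure_executable cpi
```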

Running your application

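Now that you've compiled the executable, it's time to run the job. The following RSL attributes need to be included in your Condor-G job, where MODULE_NAME is the software module to load and JOB_DIRECTORY is the directory the job runs in: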

  • jobType=mpi
  • handle=MODULE_NAME
  • directory=JOB_DIRECTORY
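
Since the RSL string is easy to mistype, one option is to assemble it from variables. The following is a sketch only: the function name build_mpi_rsl is hypothetical, and the xcount and host_xcount attributes are taken from the example submit script on this page and passed through unchanged.

```shell
#!/bin/bash
# Hypothetical helper: print a GlobusRSL value for an MPI job.
# Arguments: module name, job directory, xcount, host_xcount.
build_mpi_rsl() {
    module=$1
    directory=$2
    xcount=$3
    host_xcount=$4
    printf '(jobType=mpi)(handle=%s)(xcount=%s)(host_xcount=%s)(directory=%s)\n' \
        "$module" "$xcount" "$host_xcount" "$directory"
}

# Usage:
#   echo "GlobusRSL = $(build_mpi_rsl mpich2-gcc "$HOME/testjobs/cpi" 2 1)"
```

The printed line can be pasted into the submit script in place of a hand-written GlobusRSL entry.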

Here is an example submit script for a simple "cpi" application:

# file: cpi.submit
Universe = grid
Grid_type= gt2

# Set Scheduler to the proper resource
GlobusScheduler = lepton.rcac.purdue.edu/jobmanager-pbs

# Make sure to set handle to the appropriate software module
GlobusRSL = (jobType=mpi)(handle=mpich2-gcc)(xcount=2)(host_xcount=1)(directory=/home/ba01/u100/ahoward/testjobs/cpi)

Executable = /home/ba01/u100/ahoward/testjobs/cpi/cpi
Arguments = -i 10000 

Stream_output = False
Stream_error = False

WhenToTransferOutput = ON_EXIT
TransferExecutable = False

Output = cpi.out
Error = cpi.err
Log = cpi.log

Notification = NEVER
Queue

Finally, the job can be started by running:

tg-steele$ grid-proxy-init
tg-steele$ condor_submit cpi.submit
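
A job submitted without a valid proxy will not be able to authenticate to the site, so it can help to check for one before submitting. The following is a minimal sketch, assuming grid-proxy-info accepts the -exists and -valid H:M options and returns a non-zero status when no suitable proxy is found. The function name submit_with_proxy and the one-hour threshold are illustrative choices.

```shell
#!/bin/bash
# Hypothetical helper: submit a Condor-G job only if a grid proxy with
# at least one hour of remaining lifetime exists.
submit_with_proxy() {
    submit_file=$1
    if ! grid-proxy-info -exists -valid 1:00 >/dev/null 2>&1; then
        echo "No proxy valid for at least one hour; run grid-proxy-init first." >&2
        return 1
    fi
    condor_submit "$submit_file"
}

# Usage:
#   submit_with_proxy cpi.submit
```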


Major Updates

-- AndrewHoward - 04 Dec 2008

Topic revision: r3 - 17 Feb 2010 - 20:05:30 - BrianBockelman