High Throughput Parallel Computing

What is HTPC and why is it important?

High Throughput Parallel Computing (HTPC) is a computational paradigm for an emerging class of applications in which large ensembles (hundreds to thousands) of modestly parallel (4- to ~64-way) jobs are used to solve scientific problems ranging from chemistry, biophysics, and weather and flood modeling to general relativity. Effectively supporting this paradigm requires work in at least three areas:

  • Solving the parallel job portability problem to enable parallel jobs to easily run on heterogeneous resources
  • Optimizing parallel jobs to effectively utilize modern multi-core technologies
  • Effectively distributing HTPC jobs across suitable distributed resources

Press Release: Bringing High Throughput Capabilities to Ensembles of Parallel Applications
HTPC presentation at Condor Week April 2010

Parallel jobs, in general, are not very portable, and as a result are difficult to run across multiple heterogeneous sites. In the Open Science Grid (OSG) framework we are currently working to minimize these barriers for an important class of modestly parallel (4- to ~64-way) jobs whose parallelism can be executed on a single multi-core machine. Currently, eight-way multi-core machines are prevalent in the OSG. Most local schedulers have mechanisms, exposed via RSL, to schedule a single job to run exclusively on one multi-core machine. By using shared memory on a single machine as the MPI interconnect, and bringing along with the job all the software infrastructure required to run this "local" parallel job, a user can easily create a portable parallel job that can run at many different OSG sites. Another important benefit is that this strategy helps to optimize multi-core machine utilization while simplifying the build and submit process, thereby making HTPC accessible to a wider class of scientific users.

Early adopter HTPC sites

  • Oklahoma-Sooner (Henry Neeman, Horst Severini, the OU team) was the first external site to begin running HTPC Jobs. Gratia Reports for OU HTPC jobs.
  • Purdue-Lepton (Preston Smith, Fengping Hu)
  • Clemson (Sam Hoover)
  • Nebraska-Firefly (Brian Bockelman)
  • UC San Diego (Frank Wurthwein, Terrence Martin, Igor Sfiligoi)
  • UW GLOW (Dan Bradley)

Setting up an HTPC job on the OSG

Currently we are testing HTPC jobs with MPI parallelism, so the directions below are aimed primarily at MPI users. However, since the parallel libraries are packaged together with the application, it is straightforward to use other frameworks such as OpenMP or Linda.

Compiling an HTPC job

The main advantage of HTPC is that almost any MPI implementation that supports shared memory can be used; MPICH2 and Open MPI have both been tested. To compile an HTPC job, simply compile as usual on any accessible machine. This need not be a head node, or even an OSG resource, but it does need to be OS- and architecture-compatible with the OSG sites where the job will run. Currently, Scientific Linux 4 on a 32- or 64-bit machine is a good universal donor system, using Open MPI as the MPI implementation. Another significant benefit is that it is easy to test these jobs locally, without waiting through scheduling delays. Statically linking the binary can ensure that it does not depend on any shared libraries that are unavailable on the target system.
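
For example, building a hypothetical MPI application from a source file md.c with Open MPI's compiler wrapper might look like the following sketch; the file names are placeholders, and the -static flag may need adjusting depending on which static libraries are available on the build machine.

mpicc -O2 -o mdrun md.c                  # ordinary dynamic build, fine for local testing
mpicc -O2 -static -o mdrun md.c          # statically linked build for maximum portability
mpiexec -np 8 ./mdrun some_input_file    # quick local test on a multi-core workstation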

Submitting an HTPC job

There are two main tricks to submitting an HTPC job. The first is transferring the mpiexec command along with the job itself. Because the job will be transferred as a "data" file, the main executable in the Condor-G submit file needs to be a wrapper script that simply sets the executable bit on the MPI job proper and then calls mpiexec.

For example, a wrapper script to run an 8-way HTPC job might look like

#!/bin/sh

chmod 0755 ./mdrun ./mpiexec

./mpiexec -np 8 ./mdrun some_input_file

The second trick is requesting that the local scheduler allocate all the cores on a single machine to your job. On PBS sites, this is done with an RSL attribute in the Condor-G submit file:

GlobusRSL = (xcount=8)
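
Putting the two tricks together, a Condor-G submit file for the 8-way example above might look roughly like the following sketch. The gatekeeper name and the wrapper file name (wrapper.sh) are placeholders, and the full RSL shown here (jobtype, xcount, host_xcount) follows the PBS entries in the glide-in site configurations later in this page; use whatever RSL the target site expects.

universe                = grid
grid_resource           = gt2 osg-gw.example.edu/jobmanager-pbs
executable              = wrapper.sh
transfer_input_files    = mdrun, mpiexec, some_input_file
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
GlobusRSL               = (jobtype=single)(xcount=8)(host_xcount=1)
output                  = htpc_job.out
error                   = htpc_job.err
log                     = htpc_job.log
queue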

Batch Scheduler Setup

Enabling multi-core jobs requires some site-specific configuration of the local batch system.

PBS

PBS should work out of the box by using the "xcount" option supported by the PBS jobmanager.
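
For example, on a PBS site with 8-core worker nodes, a Condor-G submit file can request all eight cores on a single host with an RSL along these lines (mirroring the PBS entries in the glide-in site configurations later in this page):

GlobusRSL = (jobtype=single)(xcount=8)(host_xcount=1)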

Condor

Condor requires the worker nodes to be configured to run whole-machine jobs; directions are available on a separate page, and a rough sketch of the idea follows.
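
As a minimal sketch of the simplest approach (dedicating a worker node entirely to whole-machine jobs), the startd configuration might look like the following. The full recipe, which also backfills the node with ordinary single-core jobs, is more involved; the attribute names CAN_RUN_WHOLE_MACHINE and RequiresWholeMachine here match the glide-in RSLs shown later in this page.

# Advertise a single slot that owns the entire machine
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=100%, memory=100%, disk=100%

# Advertise that this machine accepts whole-machine jobs, and only match
# jobs that explicitly set +RequiresWholeMachine = True
CAN_RUN_WHOLE_MACHINE = True
STARTD_ATTRS = $(STARTD_ATTRS), CAN_RUN_WHOLE_MACHINE
START = ($(START)) && (TARGET.RequiresWholeMachine =?= True)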

LSF

LSF requires some changes to the job manager to allow the -x option to be passed to the LSF bsub command. Add the following lines to the LSF job manager, in the section where options are being parsed:

   if (defined($description->exclusive()))
   {
       print JOB "#BSUB -x\n";
   }
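
With that change in place, an HTPC job can request exclusive use of a node on an LSF site by adding the exclusive attribute to its RSL, as the OU entry in the glide-in site configurations below does:

GlobusRSL = (jobtype=single)(exclusive=1)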

HTPC Schema Attributes

The following attributes will be added to the Glue schema as a CECapability:

Attribute Name              Attribute Type   Description
GlueCECapability            string           htpc
HTPCrsl                     string           extra RSL needed to enable HTPC jobs
HTPCAccessControlBaseRule   string           ACBR format to specify one or more of VO: or VOMS:

Another useful variable for HTPC is the "number of cores per machine", but this can be calculated as:

  • number of cores per machine = GlueHostArchitectureSMPSize * (LogicalCPUs / PhysicalCPUs).

The following attributes need to be included in the CE section of the "config.ini" file.

htpc = enabled
htpc_queues = queue1, queue2, queue3 # can also take "*"
htpc_blacklist_queues =
htpc_rsl = (foo=bar) # this is the default for HTPC queues
htpc_rsl_queue1 = (zork=quux) # this is a specialized rsl for queue1 only

In the future, if we need to extend this to support an alternative CE such as CREAM, we can add new lines similar to the last two using CREAM syntax.
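
As a concrete illustration (the values below are examples, not defaults), a PBS-based CE with 8-core worker nodes might fill these settings in as follows:

htpc = enabled
htpc_queues = *
htpc_blacklist_queues = debug
htpc_rsl = (jobtype=single)(xcount=8)(host_xcount=1)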

Running Amber9 PMEMD

RENCI's Engagement team is running Amber9's PMEMD, a molecular dynamics tool, in the HTPC model.

A separate page provides details of the goals, approach, and status of the work.

HTPC Getting Started Guide

Initial Testing and Debugging

While you can certainly submit a real production parallel job to test any given site, it is often best to submit a small shell script to validate that any particular site is working correctly. There are two common failure modes with HTPC:

  1. Jobs simply do not run and remain permanently idle. It can sometimes be difficult to distinguish this kind of failure from a site that is simply busy. If all of your jobs are still idle after 24 hours, it is useful to contact the site administrator to find out what is going on.
  2. The second kind of problem is somewhat trickier. It has happened that a site is configured to accept HTPC jobs but, due to a configuration error, does not allocate a whole machine to your HTPC job; instead it allocates a single slot or core in the usual way. If your job tries to take advantage of multiple cores, or of the whole memory on the machine, it may run correctly but very slowly. Because of certain assumptions in many MPI stacks, an 8-way MPI job running on a single core will usually run at much less than 1/8th of the expected speed. To detect this, it is useful to run the following simple shell script as a test job at each site you would like to run on. The script uses several different checks to help diagnose possible failure modes.

#!/bin/sh

echo "----------------------------------------------------------------"

sleep 60

/bin/hostname

/usr/bin/uptime

/bin/uname -a

cat /proc/cpuinfo

ps auxwwr
echo "----------------------------------------------------------------"

exit 0

Once the script runs, examine the output as follows:

  1. The first thing the script does is sleep for a minute to let the load average stabilize; it then runs the "uptime" command. Ideally, the short-term (1-minute) load average reported by "uptime" should be close to 0. If it is instead close to the number of cores, that is a strong indication that HTPC is not correctly configured and the HTPC job is really running like a serial job.
  2. The number of cores on the machine is encoded in the output of /proc/cpuinfo. This should show the expected number of cores.
  3. Finally, the "r" option to the "ps" command shows all Unix processes in the runnable state, another clue to whether HTPC is enabled. Ideally, only the "ps" command itself should appear in the output. If the output of "ps" shows other users' jobs in the runnable state, that is a sure sign that HTPC is not properly enabled on this machine.
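
If you are testing many sites, a couple of grep commands over the returned stdout can automate most of these checks; the output file name used here is just a stand-in for whatever your submit file writes stdout to.

grep -c '^processor' htpc_test.out    # should equal the expected core count (e.g. 8)
grep 'load average' htpc_test.out     # the first (1-minute) figure should be near 0, not near the core count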

To test submissions to OSG sites and verify HTPC support, you can download and use this Condor-G submit file: https://twiki.grid.iu.edu/twiki/pub/Documentation/HighThroughputParallelComputing/condor_test_htpc_grid.submit. You will need to change the location and name of the HTPC test script, the resource name, and the output file names to match your parameters.

HTPC and Glide-ins

To submit jobs via glide-ins, changes must be made on the submit host as well as in the glide-in factory.

  • Changes required to the pilot (or Glide-in) factory for htpc:
    • For each HTPC site, the RSL in the corresponding entry line of the glideinWMS.xml file must be edited to look like:
      • <entry name="clemson-htpc" enabled="True" gatekeeper="osg-gw.clemson.edu/jobmanager-condor" gridtype="gt2" rsl="(condorsubmit=('+RequiresWholeMachine' TRUE))" schedd_name="submit.chtc.wisc.edu" verbosity="std" work_dir="."/>
      • This ensures that pilot jobs are started on the whole machine.

Glide-in site configurations:

  • clemson-htpc (gatekeeper: osg-gw.clemson.edu/jobmanager-condor)
    rsl="(condorsubmit=('+RequiresWholeMachine' TRUE)('Requirements' 'CAN_RUN_WHOLE_MACHINE=?=TRUE'))"
  • nebraska-red-htpc (gatekeeper: red.unl.edu/jobmanager-condor)
    rsl="(condorsubmit=('+RequiresWholeMachine' TRUE)('Requirements' 'CAN_RUN_WHOLE_MACHINE=?=TRUE'))"
  • purdue-htpc (gatekeeper: lepton.rcac.purdue.edu/jobmanager-pbs)
    rsl="(jobtype=single)(queue=tg_workq)(xcount=8)(host_xcount=1)(maxWallTime=2800)"
  • ucsd-htpc (gatekeeper: osg-gw-2.t2.ucsd.edu/jobmanager-condor)
    rsl="(condorsubmit=('+RequiresWholeMachine' TRUE))"
  • ou-htpc (gatekeeper: grid1.oscer.ou.edu/jobmanager-lsf)
    rsl="(jobtype=single)(exclusive=1)(maxWallTime=2800)"
  • nebraska-prairiefire-htpc (gatekeeper: pf-grid.unl.edu/jobmanager-condor)
    rsl="(condorsubmit=('+RequiresWholeMachine' TRUE)('Requirements' 'CAN_RUN_WHOLE_MACHINE=?=TRUE && Cpus >= 8'))"
  • smu_phy-htpc (gatekeeper: smufarm.physics.smu.edu/jobmanager-condor)
    rsl="(condorsubmit=('+RequiresWholeMachine' TRUE)('Requirements' 'CAN_RUN_WHOLE_MACHINE=?=TRUE'))"
  • renci-htpc (gatekeeper: brgw1.renci.org/jobmanager-pbs)
    rsl="(jobtype=single)(xcount=2)(host_xcount=1)(maxWallTime=2800)"

GPU Queues - Glide-in site configurations:

  • renci-htpc (gatekeeper: brgw1.renci.org/jobmanager-pbs)
    rsl="(jobtype=single)(xcount=2)(host_xcount=1)(queue=gpgpu)(maxWallTime=2800)"
