Using Worker Node Client

Note: This page refers to the ldapsearch command. The ldapsearch command is not part of the standard OSG installation, but it is commonly available on standard Linux installations. If you do not have it and would like to get it from the VDT, it is part of the VDT OpenLDAP package, which can be pulled using pacman; refer to the VDT package 'OpenLDAP'.

Introduction

This document is written for end users. It describes how to initialize the environment of your job to correctly access the execution and data areas on the worker node.

ALERT! WARNING!
Storage definitions and implementations may not be consistent across OSG sites.

What's in wn-client?

You can see directly what's in the wn-client package for this release, ITB 0.9.2.

Basically, it is the following (a short usage illustration appears after the list):

  1. Our set of CA certificates.
  2. Basic VDT underlying infrastructure (like tools to install the VDT tarballs, keep track of files, etc.)
  3. Globus client tools. This is probably slightly more than we need: it includes tools like:
    • proxy management (create proxy, proxy info, destroy proxy)
    • job submission (pre-WS and WS)
    • data transfer (globus-url-copy, the GridFTP client)
    • RLS client
  4. The Pegasus worker node software
  5. wget & curl: standard tools for downloading files with HTTP and FTP. They're commonly pre-installed by the OS, but not consistently, so they are included in the worker node client.
  6. UberFTP: command-line client for GridFTP
  7. SRM clients (v1 and v2)
  8. dccp, the dCache client
  9. MyProxy. This is included for the MyProxy client tools, but the full MyProxy package is installed, including the server, because the MyProxy software is not packaged in a way that easily separates client from server and we have not pushed for such a separation before.
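
As a brief illustration of how a job might use a couple of these tools once the client environment is set up (the destination URL and file names below are placeholders, not a real endpoint):

   # Check that the proxy delegated with the job is still valid for at least one hour.
   grid-proxy-info -exists -valid 1:0 || { echo "no usable proxy" >&2; exit 1; }

   # Copy a result file to a storage element with the GridFTP client (placeholder URL).
   globus-url-copy file://$PWD/results.tar.gz \
       gsiftp://se.example.edu/path/to/stage-out/results.tar.gz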

Storage Types Supported at OSG Sites

All OSG sites should support the following storage:

| Name | What you can do | Notes |
| $OSG_APP | Install application software releases into it via the CE. | After installation, the area is read-only accessible from all batch slots on the cluster. |
| $OSG_SITE_READ | Stage data into it using GridFTP; the data is then readable from all batch slots. | OSG sites have not consistently deployed this area. |
| $OSG_SITE_WRITE | Stage output files into it from all batch slots, for later asynchronous retrieval using GridFTP. | OSG sites have not consistently deployed this area. |
| $OSG_DATA | Data files accessible read-write via NFS from all batch slots. | OSG sites have not consistently deployed this area. |
| $OSG_GRID | Set of "client tools" that are part of the OSG software stack. | Available at each batch slot. |
| $OSG_WN_TMP (1) | Temporary storage area in which your job(s) run. | Local to each batch slot. |

  1. OSG sites have inconsistently implemented $OSG_WN_TMP. The default OSG configuration is a single top-level $OSG_WN_TMP on each worker node, shared simultaneously by multiple jobs. Good-citizen grid users are expected to make a subdirectory below this directory in which to execute their job, for instance: export mydir=$OSG_WN_TMP/myWorkDir$RANDOM ; mkdir $mydir ; cd $mydir
    • A significant number of sites use the batch system to make an independent directory for each user job and change $OSG_WN_TMP on the fly to point to this directory. In these cases they will usually advertise OSG_WN_TMP as UNAVAILABLE. A few other sites simply dump all user jobs into a cluster-wide NFS-shared directory, which is very bad practice.

There is no way to know in advance how much scratch disk space any given worker node has available, as OSG information systems do not advertise this; often it is shared among a number of job slots. If your job has significant I/O, it is important to create a subdirectory as above so you don't interfere with other jobs that may be using the node.
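
A fuller sketch of this pattern, assuming $OSG_WN_TMP is defined in the batch-slot environment (the directory name and cleanup policy are only illustrative):

   #!/bin/bash
   # Make a private scratch directory under the shared $OSG_WN_TMP area.
   # Fall back to the current working directory if the variable is unset or empty.
   scratch_base=${OSG_WN_TMP:-$PWD}
   mydir=$scratch_base/myWorkDir.$$.$RANDOM
   mkdir -p "$mydir" || exit 1
   cd "$mydir" || exit 1

   # Clean up the scratch directory when the job exits so the shared area stays tidy.
   trap 'cd /; rm -rf "$mydir"' EXIT

   # ... run the actual payload from here ...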

Input and Output Files Specified via the Condor-G JDL

Condor-G allows specification of the following:

| executable | one file |
| output | one file (stdout) |
| error | one file (stderr) |
| transfer_input_files | a comma-separated list of files (see warning 1 below) |
| transfer_output_files | a comma-separated list of files (see warning 2 below) |

ALERT! WARNING!
(1) Do not use these attributes to transfer more than a few megabytes: these files are transferred via the CE headnode and can cause serious load, which can bring down the cluster.

ALERT! WARNING!
(2) Do not spool gigabyte-sized files via the CE headnode using Condor file transfer. Space on the headnode tends to be limited, and some sites put severe quotas on the GRAM scratch area through which these files are spooled. Instead, store them in the dedicated stage-out spaces and pull them from the outside as part of a DAG.
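
For illustration, a minimal Condor-G submit description using these attributes might look like the sketch below; the gatekeeper contact, executable, and file names are placeholders, not a real site.

   # Hypothetical Condor-G submit description; hostnames and file names are placeholders.
   universe              = grid
   grid_resource         = gt2 gatekeeper.example.edu/jobmanager-condor
   executable            = my_analysis.sh
   output                = my_analysis.out
   error                 = my_analysis.err
   log                   = my_analysis.log
   transfer_input_files  = input1.dat,config.txt
   transfer_output_files = summary.dat
   queue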

In the remainder of this document we describe how to find your way around these various storage areas, both from outside the site before you submit your job and from inside the site after your job starts in a batch slot.

Finding your way around

HELP NOTE
A definition of the concepts used in this document can be found in Local Storage Configuration, including a section describing minimal requirements and some sample configurations.

Here we describe how to find the various storage locations. We start by describing how to find things after your job starts, and then complete the discussion by describing what you can determine from the outside, before you submit a job to the site. We deliberately do not describe what these various storage implementations are. That is done in the Local Storage Configuration document.

Setting Up $OSG_GRID

All OSG sites should be configured so that the $OSG_GRID environment variable is already defined when your job starts running. First see Running "source osg_grid_setup.sh".
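
A minimal sketch of doing this in a job wrapper, assuming the setup script is installed as $OSG_GRID/setup.sh (the exact script name can differ by site; see the page referenced above):

   #!/bin/bash
   # Source the worker node client setup so the bundled tools are on PATH.
   # $OSG_GRID should already be defined in the batch-slot environment.
   if [ -n "$OSG_GRID" ] && [ -f "$OSG_GRID/setup.sh" ]; then
       source "$OSG_GRID/setup.sh"
   else
       echo "OSG_GRID not set or setup script not found; check the site configuration" >&2
       exit 1
   fi
   # The Globus, SRM, and other clients described above should now be usable.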

Getting Information About CE Storage Before You Submit a Job

OSG sites advertise their properties via the Generic Information Provider (GIP). This information can be queried from outside the site. It is meant to be used to select sites that support the functionality you need for your applications.

Generic Information Provider (GIP)
A configurable LDAP information provider that differentiates between static and dynamic information. OSG sites use GIP to advertise a variety of grid-related configuration data. GIP is interoperable with LCG.

The GIP uses the GLUE schema, and all information may be read via LDAP queries. The following fields are of particular importance in the context of storage:

| GLUE schema name | OSG correspondence | Notes |
| GlueCEInfoApplicationDir | $OSG_APP | |
| GlueCEInfoDataDir | $OSG_DATA | |
| Worker node client directory | $OSG_GRID | Published with GlueLocationLocalID=OSG_GRID, GlueLocationName=OSG_GRID, and GlueLocationPath=<the path to the directory>. |
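
For example, the application and data directory attributes can be read with a query along the following lines; it reuses the BDII host and the FNAL_GPFARM site name from the example later on this page, so substitute your target site:

   $ ldapsearch -x -h is.grid.iu.edu -p 2170 \
        -b mds-vo-name=FNAL_GPFARM,mds-vo-name=local,o=grid \
        GlueCEInfoApplicationDir GlueCEInfoDataDir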

HELP NOTE
The GLUE schema allows the site to configure different paths for SITE_READ and SITE_WRITE and DEFAULT_SE for each VO. However, few sites are likely to do this as it requires manual editing of the schema file.

Grid Laboratory Uniform Environment (GLUE) schema
An abstract model of Grid resources, with mappings to concrete schemas that can be used in Grid Information Services. It aims to define, publish, and enable the use of common schemas for interoperability between the EU and US physics grid projects. See also the GLUE Schema site.

Selecting GLUE Schema Attributes Using ldapsearch

The following ldapsearch syntax retrieves all GlueLocationLocalID values (there are several) published for fngp-osg.fnal.gov.

$ ldapsearch -x -h is.grid.iu.edu -p 2170   -b mds-vo-name=FNAL_GPFARM,mds-vo-name=local,o=grid   GlueLocationLocalID

   version: 2
   
   #
   # filter: (objectclass=*)
   # requesting: GlueLocationLocalID
   #
   
   # fngp-osg.fnal.gov, local, grid
   dn: GlueClusterUniqueID=fngp-osg.fnal.gov, mds-vo-name=local,o=grid
   
   # fngp-osg.fnal.gov, fngp-osg.fnal.gov, local, grid
   dn: GlueSubClusterUniqueID=fngp-osg.fnal.gov, GlueClusterUniqueID=fngp-osg.fna
    l.gov, mds-vo-name=local,o=grid
   
   # OSG_SITE_READ, fngp-osg.fnal.gov, fngp-osg.fnal.gov, local, grid
   dn: GlueLocationLocalID=OSG_SITE_READ, GlueSubClusterUniqueID=fngp-osg.fnal.go
    v, GlueClusterUniqueID=fngp-osg.fnal.gov, mds-vo-name=local,o=grid
   GlueLocationLocalID: OSG_SITE_READ
   
   # OSG_SITE_WRITE, fngp-osg.fnal.gov, fngp-osg.fnal.gov, local, grid
   dn: GlueLocationLocalID=OSG_SITE_WRITE, GlueSubClusterUniqueID=fngp-osg.fnal.g
    ov, GlueClusterUniqueID=fngp-osg.fnal.gov, mds-vo-name=local,o=grid
   GlueLocationLocalID: OSG_SITE_WRITE
   
   # OSG_GRID, fngp-osg.fnal.gov, fngp-osg.fnal.gov, local, grid
   dn: GlueLocationLocalID=OSG_GRID, GlueSubClusterUniqueID=fngp-osg.fnal.gov, Gl
    ueClusterUniqueID=fngp-osg.fnal.gov, mds-vo-name=local,o=grid
   GlueLocationLocalID: OSG_GRID
   
   # search result
   search: 2
   result: 0 Success
   
   # numResponses: 6
   # numEntries: 5

Examples

This Perl script can be used to query either the GRIS or the BDII to get the OSG_GRID area.
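
If the script is not at hand, a rough shell equivalent using ldapsearch is sketched below. It assumes the BDII endpoint and site name from the example above (adjust for your site); the GRIS on the CE can be queried the same way with its own host and port.

   #!/bin/bash
   # Sketch: look up the OSG_GRID path for a site via the BDII.
   # The BDII host/port and site name follow the example above; change as needed.
   BDII_HOST=is.grid.iu.edu
   BDII_PORT=2170
   SITE=FNAL_GPFARM

   ldapsearch -x -LLL -h "$BDII_HOST" -p "$BDII_PORT" \
       -b "mds-vo-name=$SITE,mds-vo-name=local,o=grid" \
       '(GlueLocationLocalID=OSG_GRID)' GlueLocationPath \
     | awk '/^GlueLocationPath:/ {print $2}'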


