Relocating Globus Directories

By default, Globus GRAM writes a handful of files per job in various directories:

  • State and lock files: This is a checkpoint of the GRAM job's state; defaults to /var/lib/globus/gram_job_state; there is one file per job. If you use Condor, there is additionally one log file per job. There's a third file, a lock file, which is a link to the per-globus-job-manager lockfile in /var/lib/globus/gram_job_state/$LOGNAME/$HOSTNAME. The latter directory contains a credential, a lock file, a pid file, and a communication socket.
  • Cached executables: Each executable is written into a cache; identical executables will only be written once; this defaults to $HOME/.globus/.gass_cache. There is at least one file per job (the executable). Except for Condor, this directory must be shared with worker nodes.
  • Job home directory: This directory is created per-job (defaults to $HOME/.globus/job/$HOSTNAME/$GRAM_ID). It provides the default $CWD on job startup, and it is where the stdout, stderr, remote I/O contact file, and proxy credentials are kept. Except for Condor, this directory must be shared with worker nodes.
    • A job can ask for a specific, non-default, starting directory. This is strongly discouraged and likely doesn't work at many OSG sites.
  • Scratch directory: If there are additional input files, a temporary directory ( of the form $HOME/gram_scratch_XYZ) is created; the input files are placed there. If there are no input files, this directory is not created. Except for Condor, this directory must be shared with worker nodes.

So, for a Condor job with 2 input files to successfully launch, 10 unique files and 2 new directories must be created; 7 of these go into the user's $HOME by default.

There are many good reasons to want to separate GRAM from $HOME. Condor requires no shared directories at all - but even for PBS, separating the two makes management of these Globus files easier.

Relocating directories:

The directories used by GRAM are controlled in /etc/globus/globus-gram-job-manager.conf. The relevant options are:

  • -state-file-dir. This controls the state and lock files; it defaults to /var/lib/globus/gram_job_state. We recommend keeping it there. The state file directory must be root-owned and mode 1777.
  • -cache-location. This controls the cache files; if you want to change this, we recommend setting it to /var/lib/globus/gass_cache/$(LOGNAME) and creating /var/lib/globus/gass_cache with root and mode 1777.
  • -globus-job-dir. This controls the job home directories; if you want to change this, we recommend setting it to /var/lib/globus/job_home (pre-create this directory as root-owned, mode 1777). See the note below about the RVF file.
  • -scratch-dir-base. This controls location of gram_scratch directories. If you want to change this, we recommend setting it to /var/lib/globus/gram_scratch (pre-create the directory, root-owned, mode 1777).

All the recommendations above are for Condor. For PBS, remember the last three directories must be exported to all the worker nodes and have the same directory name on the worker nodes. It might be convenient to use the prefix /mnt/nfs/globus instead of /var/lib/globus in such a case. None of the directories are written to by the root user, so root_squash for NFS mounts is acceptable.

RVF file changes

If you relocate the globus-job-dir above, you need to add the following text to /etc/globus/gram/job-manager.rvf:

# RSL Validation Information for the base RSL supported by the job
# manager
Attribute: directory
Description: "Specifies the path of the directory the jobmanager will use as
              the default directory for the requested job."
Default: /var/lib/globus/job_home/$(LOGNAME)
ValidWhen: GLOBUS_GRAM_JOB_SUBMIT
DefaultWhen: GLOBUS_GRAM_JOB_SUBMIT
Topic revision: r2 - 02 Nov 2012 - 14:52:02 - BrianBockelman
Hello, TWikiGuest!
Register

 
TWIKI.NET

TWiki | Report Bugs | Privacy Policy

This site is powered by the TWiki collaboration platformCopyright by the contributing authors. All material on this collaboration platform is the property of the contributing authors..