Local Storage Configuration
This document describes how to configure the OSG attributes, including ones referencing "CE storage", during the installation and afterwards (if the CE layout needs to change).
You must make these attributes known to OSG. They will be published as part of the GLUE schema using the GIP and used directly or indirectly by other OSG applications and users submitting jobs. The configuration is set in
. This is a standard configuration file that you can edit or review directly. The configuration script
automates much of the configuration. The meaning and purpose of the various elements of the configuration attributes are documented further below, and in the GLUE documentation
. New resource administrators may want to read that information carefully and determine how to map those elements onto their Resource before proceeding. Guidance on the basic elements and common defaults is provided below.
Gather configuration information
OSG strives to make resources available with minimal requirements; however, the grid requires certain information about files and filesystem mount points to provide a basic execution environment. For applications to be installed and executed correctly, filesystem sharing and the filesystem mount points available on a cluster must be coordinated. For this purpose, administrators must define special directory hierarchies (mount points) and allocate them in the OSG environment. Many of these mount points should be available on the head (gatekeeper) node and, under exactly the same path, on each of the worker nodes. Generally, they do not have to be made available in the form of a shared filesystem across the whole cluster. Read-only spaces can generally be provisioned with or without a shared filesystem as long as you provide consistent paths.
$OSG_LOCATION
This points to the OSG software installation location on the CE. It must be writable by root. This attribute is automatically set up by the
script. The $OSG_LOCATION directory should not be exported to the worker nodes.
$OSG_GRID
Where the OSG worker-node client software is installed; see Worker Node Client
for a description. $OSG_GRID includes client utilities for Grid middleware, such as
. It should be writable by root and readable by all users. It must be accessible by both gatekeeper and worker nodes via a shared filesystem, or different installations on local disks using a consistent pathname.
$OSG_APP
Base location for VO-specific application software.
$OSG_APP is mounted read-only on all worker nodes in the cluster. Only users with software installation privileges in their VO should have write privileges to this directory. At least 10 GB of space should be allocated per VO.
$OSG_DATA or $OSG_SITE_READ and $OSG_SITE_WRITE
The data directories are intended as spaces where applications write input and output data files that must persist beyond the lifetime of the job that created them.
- These directories should be writable by all users.
- Users will be able to create sub-directories which are private, as provided by the filesystem.
- At least 10 GB of space should be allocated per worker node; some VOs require much larger allocations.
The following different options are possible:
- $OSG_DATA: shared directory with read-write access for all users
- $OSG_SITE_READ: shared directory with read-only access for all users (data may be prestaged by the administrator or using an SE pointing to the same space)
- $OSG_SITE_WRITE: shared directory with write-only access for all users (data may be staged out by the administrator or using an SE pointing to the same space)
A CE can provide $OSG_DATA, both $OSG_SITE_READ and $OSG_SITE_WRITE, or none of them if it has a local SE specified in $OSG_DEFAULT_SE. If a particular hierarchy is not available on your CE, provide the keyword
- The $OSG_DATA, $OSG_SITE_READ and $OSG_SITE_WRITE directories must be accessible from the head node as well as each of the worker nodes.
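The access rules above can be spot-checked from a shell. The following is a minimal sketch, not part of the OSG software: `check_area` is a hypothetical helper, and treating the value UNAVAILABLE as "not provided" for the data directories is an assumption borrowed from the way this document describes $OSG_DEFAULT_SE.

```shell
#!/bin/bash
# check_area NAME VALUE MODE
# Hypothetical helper: reports whether a storage variable points at a usable
# directory. MODE is any combination of "r" (readable) and "w" (writable).
check_area() {
    local name="$1" value="$2" mode="$3"
    # Assumption: an empty value or UNAVAILABLE means the area is not offered.
    if [ -z "$value" ] || [ "$value" = "UNAVAILABLE" ]; then
        echo "$name: not provided on this CE"
        return 0
    fi
    if [ ! -d "$value" ]; then
        echo "$name: $value is not a directory on this node"
        return 1
    fi
    case $mode in
        *r*) [ -r "$value" ] || { echo "$name: $value is not readable"; return 1; } ;;
    esac
    case $mode in
        *w*) [ -w "$value" ] || { echo "$name: $value is not writable"; return 1; } ;;
    esac
    echo "$name: $value looks usable"
}

# Run as an ordinary (non-root) user: root passes the -r and -w tests regardless.
check_area OSG_DATA "${OSG_DATA:-}" rw || true
check_area OSG_SITE_READ "${OSG_SITE_READ:-}" r || true
check_area OSG_SITE_WRITE "${OSG_SITE_WRITE:-}" w || true
```

Running the same script on the head node and on a sample of worker nodes gives a quick indication of whether the paths are consistent across the CE.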
$OSG_WN_TMP
A temporary directory local to the worker node, used as a working directory.
- At least 10 GB per virtual CPU should be available in this directory (e.g., a worker node with 2 hyperthreaded CPUs that can run up to 4 jobs should have 40 GB).
- Files placed in this area by a job may be deleted upon completion of the job.
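The sizing rule above is simple arithmetic, sketched here for convenience. Both function names are hypothetical, and the 10 GB default simply restates the guideline from the text; adjust it if your VOs need more.

```shell
#!/bin/bash
# required_wn_tmp_gb SLOTS [GB_PER_SLOT]
# Minimum scratch space for one worker node, given how many jobs it can run.
required_wn_tmp_gb() {
    local slots="$1" per_slot="${2:-10}"
    echo $(( slots * per_slot ))
}

# available_gb DIR
# Free space on the filesystem holding DIR, in whole GB (from POSIX df output).
available_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1048576) }'
}

# The example from the text: a node that can run up to 4 jobs.
required_wn_tmp_gb 4
```

The final call prints 40. On a worker node, comparing that figure with the output of `available_gb "$OSG_WN_TMP"` shows whether the scratch area meets the guideline.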
$OSG_DEFAULT_SE
A storage element that is close to and visible from all the nodes of the CE, both worker and head node.
The value to be specified in $OSG_DEFAULT_SE is the full URL, including method, host/port and path of the base directory. This full URL must be reachable from inside as well as outside the cluster. The $OSG_DEFAULT_SE generally supports only put and get, rather than open/read/write/close.
If the CE has no default SE it can use the value UNAVAILABLE for $OSG_DEFAULT_SE.
- The current release supports SRM and gftp for $OSG_DEFAULT_SE.
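A job can inspect $OSG_DEFAULT_SE before deciding how to stage data. The sketch below is illustrative only: `se_kind` is a hypothetical helper, and the `srm://` and `gsiftp://` URL prefixes are assumptions about how the SRM and GridFTP methods are written in the URL.

```shell
#!/bin/bash
# se_kind VALUE
# Hypothetical helper: classify an $OSG_DEFAULT_SE value by its access method.
se_kind() {
    case $1 in
        ""|UNAVAILABLE) echo "none" ;;      # the CE has no default SE
        srm://*)        echo "srm" ;;       # assumed prefix for SRM
        gsiftp://*)     echo "gridftp" ;;   # assumed prefix for GridFTP
        *)              echo "unknown" ;;
    esac
}

kind=$(se_kind "${OSG_DEFAULT_SE:-}")
if [ "$kind" = "none" ]; then
    echo "No default SE on this CE; stage data another way."
else
    echo "Default SE uses access method: $kind"
fi
```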
As root, change to the monitoring directory and run the configuration script:
> cd $VDT_LOCATION/monitoring
script creates the
that is a link to it. The script also creates
which duplicates some of the attributes from
so that they can be placed in the job environment as well as
file is to allow site admins to set site-specific environment variables that should be present for jobs that are running on the cluster. This file is used by several monitoring services and applications to obtain basic resource configuration information. This file is required to be readable from that location for OSG CEs.
The resource owner may choose which information services to run to advertise this information. Configuration of several of the more popular ones is described below in the Monitoring section.
An already installed OSG site can be reconfigured by editing
, or by rerunning the
All OSG sites should support the following storage:
- The standard is to start your job(s) in a dedicated directory on one of the worker node's local disks and configure $OSG_WN_TMP to point to that directory. However, many OSG sites simply place your jobs on some shared filesystem and expect you to do the following before starting the job:
export mydir=$OSG_WN_TMP/myWorkDir$RANDOM ; mkdir $mydir ; cd $mydir
This is particularly important if your job has significant I/O.
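A slightly more defensive variant of the one-liner above is sketched here. It uses `mktemp -d` instead of `$RANDOM` so that two jobs cannot collide on the same directory name, and it removes the directory when the script exits. The fallback to `/tmp` and the helper name `make_workdir` are assumptions made so that the sketch also runs outside a grid job.

```shell
#!/bin/bash
# make_workdir: create a private working directory and print its path.
# Falls back to /tmp when $OSG_WN_TMP is not set (an assumption for local testing).
make_workdir() {
    local base="${OSG_WN_TMP:-/tmp}"
    mktemp -d "$base/myWorkDir.XXXXXX"
}

workdir=$(make_workdir) || exit 1

# Clean up on exit so the local disk is not left full for the next job.
trap 'cd /; rm -rf "$workdir"' EXIT

cd "$workdir" || exit 1
# ... run the job's payload here ...
```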
Input and Output Files Specified via Condor-G JDL
Condor-G allows specification of the following:
| one file
| one file - stdout
| one file - stderr
| comma-separated list of files
| comma-separated list of files
- Do not use these to transfer more than a few megabytes: the files are transferred via the CE headnode and can cause serious load, which can bring down the cluster.
- Do not spool gigabyte-sized files via the CE headnode using Condor file transfer. Space on the headnode tends to be limited, and some sites impose strict quotas on the GRAM scratch area through which these files are spooled. Instead, store them in the dedicated stage-out spaces and pull them from the outside as part of a DAG.
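One way to respect these limits is to check the total size of the files before listing them for transfer. This is a minimal sketch with hypothetical helper names; the byte limit is whatever you consider "a few megabytes" and is not a value defined by OSG or Condor.

```shell
#!/bin/bash
# total_bytes FILE...
# Sum of the sizes of the given files, in bytes. Fails if a file is unreadable.
total_bytes() {
    local total=0 f size
    for f in "$@"; do
        size=$(wc -c < "$f") || return 1
        total=$(( total + size ))
    done
    echo "$total"
}

# ok_for_condor_transfer LIMIT_BYTES FILE...
# Succeeds only if the files together stay within the chosen limit.
ok_for_condor_transfer() {
    local limit="$1"
    shift
    local total
    total=$(total_bytes "$@") || return 1
    [ "$total" -le "$limit" ]
}
```

For example, `ok_for_condor_transfer 5000000 input1.dat input2.dat || echo "use the stage-out spaces instead"` would flag a set of (hypothetical) input files larger than about 5 MB before the job is submitted.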
In the remainder of this document we describe how to find your way around these various storage areas, both from outside the site before you submit your job and from inside the site after your job starts in a batch slot.
Finding your way around
A definition of the concepts used in this document can be found in Local Storage Requirements, including a section describing minimal requirements and some sample configurations.
Here we describe how to find the various storage locations. We start by describing how to find things after your job starts, and then complete the discussion by describing what you can determine from the outside, before you submit a job to the site. We deliberately do not describe what these various storage implementations are.
This topic describes how site administrators can define CE local storage paths so that their site will be configured in a way that OSG users expect.
Local storage definitions define the paths to the disk spaces or Storage Elements (SEs) that are accessible to jobs from within an OSG Compute Element (CE). These can be set in a variety of ways depending on what you need for your CE and your site configuration.
This section also covers how these environment variables correspond to external schemas (e.g. GLUE, Grid3/OSG).
Provide the keyword
(instead of a path) if your CE does not support a particular CE storage area. This distinguishes CEs that support only certain CE storage areas from those that simply are not configured.
- These values do not refer to any storage space as viewed from outside (e.g., through a Storage Element or GridFTP server).
OSG - LDAP - Glue table
The following table shows the OSG storage variable name, its associated attribute name in the GLUE Schema 1.2, its LDAP attribute name (as in Grid3), and a description.
- The same attribute may appear in more than one place in the GLUE Schema.
- GLUE provides the possibility to have multiple values for some of the CE storage areas, depending on the VO and the Role (VOMS FQAN). Within OSG, these values are currently site-wide.
- The GLUE Schema does not have a specific attribute for SITE_WRITE or SITE_READ, but it provides the location entity (Name/Version/Path sets) to accommodate additional CE local storage. To make use of it, locations have to be defined through the GIP:
- LocalID: GRID+OSG, Name:GRID, Version: OSG, Path:
- LocalID: SITE_WRITE+OSG, Name:SITE_WRITE, Version: OSG, Path:
- LocalID: SITE_READ+OSG, Name:SITE_READ, Version: OSG, Path:
Local Storage Configuration Models
Although you are not required to provide all of the OSG storage area definitions, you must implement at least one of the OSG storage models. Providing a wider selection of CE storage areas allows users to select the one that best fits their jobs. It can also allow users to define jobs with inefficient execution models that could reduce the performance of the whole cluster. It's a trade-off.
However, unless there is a very good reason not to, you should provide $OSG_DATA for users since many VOs depend on this being present for their applications to use.
Methods of Deploying CE Storage Definitions
Declaring a storage area to be local to the CE can be done in a variety of ways, including (but not limited to):
- Defining environment variables that resolve to the correct path or URL for the storage areas.
- Making paths or URLs consistent across the CE (headnodes and WNs), and publishing this using an information provider (e.g. GIP/BDII). Users would have to do a lookup before submitting the job, so that it can carry the information necessary to use this CE.
Each CE administrator must ensure the client software will function correctly (i.e., for Globus and SRM, access to the user's proxy, gass_cache mechanism) for all jobs running on the CE. Common practice is to use a shared $HOME directory, but you can use other mechanisms as long as they are transparent to the users, who should not have any assumptions about the CE that differ from what is stated here. It is unsafe to make assumptions about the existence and characteristics (size, being shared, etc.) of the
directory. In particular, site administrators are free to deploy configurations that do not include any NFS exports from the CE, such as those described in OSG document 382.
- 18 Oct 2007
Reviewer - date: RobGardner
- 08 Nov 2007
Comment: Needs some editing for clarity