Information Services and Storage
In order to fully utilize and track the storage capabilities of the Open Science Grid, various ancillary services come into play. Some of the services apply to both compute and storage elements. General information on this topic can be found on the OSG Monitoring and Information Systems?
Catalogs are used in association with storage elements to hold metadata for files, which are identified by their "Logical File Name". Metadata of interest to experiments includes such items as
- Physical file name (including path) or URL.
- Preference ranking of the above for downloads.
- File size.
- File checksum.
- Other experiment-specific info.
Catalog implementations and schema are often customized for a particular experiment. One generic catalog, the Replica Location Service, is included in the OSG software distribution as a part of the Globus component.
Replica Location Service
The Replica Location Service is primarily used to find the physical file name or URL for a given logical file name, though other attributes are supported. RLS consists of two components: the Local Replica Catalog, and the Replica Location Index. This is due to the fact that the path portion of the physical file name may be site-dependent. The lookup is a two-stage process: a set of Local Replica Catalogs is determined from the logical file name by querying the Replica Location Index, and each of these is consulted to determine the physical file name at a site. For more information on the Replica Location service, please see the globus web site
There are two monitoring services in use in the OSG that apply to storage.
The main monitoring service of the Open Science Grid is the Resource and Service Validation service. RSV is a component allows administrators to run probes locally on resources to test their functionality. The results are sent to a central collector at the Grid Operations Center
, where the information is collected, presented, and redistributed. Storage Elements are among the resources that have RSV probes. The two tests covered by the storage element RSV probe are srmcp and srmping. RSV results may be accessed through the GOC's MyOSG
page, where storage element results may be found in web page sections labeled by the storage element ID.
As on offering outside the OSG, the SRM consortium runs a monitoring service for SRM Storage Elements. All of the srm client commands are tested for each site. To use the service
The Discovery mechanism provides specific information on grid services that allow user operations to be planned and executed. For example, users of storage need to know which sites authorize their VO to access storage, and what specific storage services are offered; their endpoints, paths, etc. In the Open Science Grid, the means of providing this information is through the BDII, a centralized service which contains configuration information for each site and can be queried by users. The OSG BDII
is populated via processes based on the Generic Information Provider (GIP). See the Information Services Activity
page for more information on the GIP.
A tool for querying the BDII, the OSG Discovery Tool
, is provided through the VDT client package. Pre-written queries wrapped an a command-line interface are included, for which the user need only specify their VO. Queries related to storage include
- Example gridftp URL to use for a site.
- Example srm URL to use for a site.
- Example srm copy command.
- Example space reservation command.
- Mount points of storage systems on compute worker nodes.
The examples returned by the tool can be used to construct user-specific commands that should succeed on the storage resource. The mount points can be used when a process needs to access a storage element from a worker node using posix IO. The
mount point gives the necessary site-specific path information the defines the user-accessible storage area.
An example of the use of information services with Storage
Putting the Information Services components together with the Storage Element components is shown in the following diagram. Suppose a user wants to place a file on a Storage Element and at some later time run a job on an associated Compute Element, which accesses the file. In this scenario for the first part the user's script first executes a OSG Discovery Tool
command which tells it which sites authorize the user's VO and what path the VO is to use. A site is chosen and the script then consults monitoring information to verify that that the site is currently online. The file is then written through the srm server and its location is recorded in a catalog. For more demanding data placement requirements, a [#FileTransferServices][file transfer service]] may be needed.
For the job execution part, the script reads from the catalog to determine the sites which hold the particular file. A site is chosen and the monitoring info is checked to ensure that the site is online. During the job the file is either accessed directly from the storage element or temporarily copied from the storage element to the worker node. In the first case, files on the storage element must be available through mount points on the worker nodes; if so the mount point is determined by the discovery tool and inserted into the job script. Otherwise, the file is copied via a URL obtained from the catalog or from the composition of the filename with an endpoint and path found through the discovery service.
Automated matchmaking at job execution time with additional constraints on job execution environment and other parameters is available through the OSG Resource Selection Service. Attributes for storage elements are included in ReSS classads. For more information, please see the topic OSG Resource Selection Activity?
The OSG Matchmaker
is based on ReSS and adds additional classad information. Monitoring probes allow resource availability to be included in the matchmaking process, and VO-specific attributes are added to direct processing to resources owned by the experiment.
It is useful for site and grid administrators to have an overview of activity on their site. An accounting system typically shows plots or other kinds of reports of historical usage data.
The Gratia accounting service
keeps a record of resource usage at a site or in the OSG as a whole. The usage is recorded after the fact and does not interact with the actual running of jobs or storage transactions. Usage data is collected by "gratia probes" running on grid resources, which write either to a site's local collector or to a central OSG collector. There are two types of probes that report storage related information.
* The transfer probe reports to Gratia the details of each file transfer into or out of a file server. Because the the potentially large volume of information, the transfer probe writes to a site-local gratia collector. There are flavors of transfer probe, depending on the storage implementation being deployed.
- For Bestman using gridftp file servers or for stand-alone gridftp servers, the Gratia Transfer Probe is used. This probe works by scraping the gridftp log files.
- For dCache, the Gratia Dcache Probe is used. The probe gets this information from the dCache "billing" database and should run on the dcache node on which the dCache http domain is running. The probes are installed on their correct nodes by the VDT install scripts during an initial dCache installation or upgrade.
- The other type of storage probe is responsible for reporting storage capacity and storage usage. The reporting is to the central Gratia repository. The information reported is:
- The storage capacity and amount used for each dCache pool.
- The storage capacity and amount used for each SRM Space reservation.
Reports from the grid-central gratia service may be viewed from the MyOSG
page, though at present there are no pre-written reports for storage.
The dCache Chronicle sends daily summaries of capacity an usage of dCache storage elements via email. It is part of the OSG Operations Toolkit
and may be installed via rpm
A daily email report on capacity, usage, and health for Hadoop HDFS-based storage elements is available through a yum installation. Details may be found at the Hadoop Storage Probe
Please see the section File Transfer Services
in the Storage for the Application Developer
- 12 Apr 2010