Installation of an ITB Storage Element


This page is an overview of the installation and testing procedure for a Storage Element on the Open Science Grid's Integration Test Bed.


Installation

Install the components listed below for the storage element you deploy, either BeStMan or dCache.

BeStMan

BeStMan server

BeStMan is a full implementation of SRM v2.2, developed by Lawrence Berkeley National Laboratory, for disk-based storage and mass storage systems such as HPSS. End users may run their own personal BeStMan, which manages their local disks and gives them an SRM interface. It works on top of an existing disk-based Unix file system and has so far been reported to work on file systems such as NFS, PVFS, AFS, GFS, GPFS, xrootdFS and Lustre. It also works with any existing file transfer service, such as gsiftp, http, https, bbftp and ftp. It requires minimal administrative effort for deployment and updates.

The distribution package is available through the VDT pacman installation or as a tar file download from the SDM group at LBNL. Configuration options and validation samples are available on this link.

If the storage spans multiple file systems or composite storage systems, consider SRM/dCache instead.

dCache

dCache is a storage element implementation that meets the SRM v2.2 specification. dCache is not packaged with the base Virtual Data Toolkit installation, but is available through the VDT as a separately downloadable package. In addition, OSG provides Gratia probes for site and grid-wide monitoring, Generic Information Providers for the support of information services, and a suite of test scripts to validate a dCache installation.

dCache server

VDT has bundled the dCache server rpm into a package which includes a configuration dialog and installation scripts for postgres and pnfs, the services used by dCache. To install dCache, download the package from VDT Downloads. After untarring the package, either follow the instructions in the README or refer to Installation Procedure.

After dCache is installed, authorization for access must be configured. Please refer to the gPlazma chapter of the dCache book on how to configure authorization for your site's policies.

Setting up the Replica Manager

The Replica Manager feature of dCache automatically makes extra copies of files that are written into the storage element. The purpose is to increase performance for subsequent reads of the files and to offer some protection against data loss in systems that are not backed by tape. In the VDT package for dCache the Replica Manager is turned on by default, with a minimum of two and a maximum of three replicas per file. However, dCache pools must be explicitly added to the "ResilientPools" group in the PoolManager.conf file for replication to take effect. See Overview of Storage Resource Manager (SRM) and dCache for OSG for details of the dCache pools and the PoolManager. Only files written to resilient pools will be replicated.
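As a rough illustration of the replica bounds described above, the following Python sketch classifies a file by its replica count. It is not dCache code: the function name `replica_action` and the labels it returns are hypothetical, and only the default bounds (two and three) and the resilient-pool condition come from the text.

```python
MIN_REPLICAS = 2  # VDT default minimum, as stated in the text
MAX_REPLICAS = 3  # VDT default maximum, as stated in the text


def replica_action(replica_count, in_resilient_pool=True):
    """Return a hypothetical label for how a file's replica count compares to the bounds."""
    if not in_resilient_pool:
        # Only files written to resilient pools are replicated.
        return "ignore"
    if replica_count < MIN_REPLICAS:
        return "replicate"  # too few copies
    if replica_count > MAX_REPLICAS:
        return "reduce"  # more copies than the maximum
    return "ok"


if __name__ == "__main__":
    for count in (1, 2, 3, 4):
        print(count, replica_action(count))
```

A site that changes the default bounds would adjust the two constants accordingly.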

For instructions on the replica manager setup, please see Configuration of Storage Elements for Replica Management.

Setting up Space Reservation for Opportunistic Use

The Space Reservation feature of dCache guarantees users a given amount of storage for a period of time. For opportunistic use, the idea is that users make a space reservation to support a job running on a compute element. Such reservations will be for relatively small amounts of storage and of short duration. See the Use of Storage in OSG document for details. In the VDT packaging of dCache, Space Reservation is turned on by default. However, dCache pools must be explicitly added to the "public" pool group in the PoolManager.conf file for space reservation to take effect. Files written using space reservations will always be written to the associated pools.

For instructions on the opportunistic storage setup, please see Configuration of Storage Elements for Opportunistic Use.

dCache Gratia probes

The probes report storage related information to the central Gratia collector. The probes are installed on their correct nodes by the VDT install scripts during an initial dCache installation or upgrade.

There are two types of probes:

  • The transfer probe reports to Gratia the details of each file transfer into or out of a dCache file server.

The probe gets this information from the dCache "billing" database and should run on the dCache node on which the dCache http domain is running. For performance reasons, sites with large dCache billing databases are advised to alter the "billinginfo" table by adding an index on the pair of columns (datestamp, transaction) and to alter the "doorinfo" table by adding an index on the (transaction) column. This should speed up the search for newly added records.
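The two indexes suggested above could be expressed as the SQL statements generated by this small Python sketch. The table and column names come from the text; the index names are hypothetical, and the statements assume the PostgreSQL `CREATE INDEX` syntax. Review the output before applying it to your billing database.

```python
# (index name, table, columns); the index names are hypothetical.
INDEX_SPECS = [
    ("billinginfo_date_trans_idx", "billinginfo", ("datestamp", "transaction")),
    ("doorinfo_trans_idx", "doorinfo", ("transaction",)),
]


def index_statements(specs=INDEX_SPECS):
    """Build one CREATE INDEX statement per (name, table, columns) entry."""
    return [
        "CREATE INDEX {} ON {} ({});".format(name, table, ", ".join(columns))
        for name, table, columns in specs
    ]


if __name__ == "__main__":
    print("\n".join(index_statements()))
```

The printed statements can then be run by the database administrator with whatever SQL client the site normally uses.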

The transfer probe will start automatically at boot time if the installation option to do so is selected. Otherwise, start it with

service gratia-dcache-transfer start

The probe will then run continuously in the background.

  • The storage probe is responsible for reporting storage capacity and storage usage to the central Gratia repository. The information reported is:
    • The storage capacity and amount used for each dCache pool.
    • The storage capacity and amount used for each SRM Space reservation.

It gets the pool information from the dCache admin server. It gets the SRM information from the SRM tables in the SRM "srmdcache" database. Therefore it should run on the dCache node on which srm is running. The probe runs as a cron job on the host running SRM, so no action need be taken to start it.

Information on how to verify the reports generated by the probes is provided in their respective README files (under /opt/d-cache/gratia/probe/dCache-transfer and /opt/d-cache/gratia/probe/dCache-storage). In brief, to verify that the Gratia probes are working, run some transfers, including some that use space reservation. A convenient way to do that is to run the validation test suite (see below) and then check the information in the Gratia collector.

Details on how to install and configure the dCache Gratia probes are provided in Gratia dCache Installation Procedure.

dCache Generic Information Provider

The purpose of the dCache Generic Information Provider is to discover and publish all storage related information corresponding to your dCache based Storage Element. After you have installed the OSG CE, please refer to documentation on how to configure the GIP for the SE on the dCacheGIP page.

Note: At the moment, to get the most up-to-date, bug-free version, you'll need to switch from the GIP distributed with ITB 0.9.0 to SVN. The instructions on how to do this are available on the GipInstall page.

dCache Validation Suite

The dCache validation test suite is designed to test basic dCache functionality as well as more advanced dCache features. It is developed and maintained at Fermilab. The test suite is mainly intended for Storage Element admins and is a good first-level check of the SE. If all tests in the suite succeed, the basic functionality exists on the server side.

The suite consists of three major parts:

  • Fermi SRM Client Test Suite. The main purpose of the Fermi SRM Client Test Suite is to run various srmclient commands developed at Fermilab against a dCache based Storage Element (SRM V2.2 only). Various tests that are run as part of this test suite include:
    • srmcp (Get/Put operations, with and without Space Reservation)
    • srmmkdir (Create new directories)
    • srmmv (Move directory from one location to another)
    • srmls (List contents of a directory)
    • srm-get-permission (Get permissions on a file/directory)
    • srm-check-permission (Check permissions on a file/directory)
    • srm-set-permission (Set permissions on a file/directory)
    • srm-reserve-space (Make a space reservation)
    • srm-release-space (Release a space reservation)
    • srmrm (Remove file)
    • srmrmdir (Remove directory)

Warning: If your dCache configuration doesn't support opportunistic storage, some of the commands will fail (e.g. srm-reserve-space, srm-release-space, srmcp with the space token option). See the above section on how to set up opportunistic storage.

  • SRM Space Management test suite. It tests space allocation/release with various options including:
    • retention_policy
    • guaranteed_size
    • desired_size
    • lifetime
    • access_latency

Warning: If your dCache configuration doesn't support opportunistic storage, you should skip this test. See the above section on how to set up space reservation for opportunistic use.
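To make the reservation options listed above concrete, here is a minimal Python sketch of a request record with a few sanity checks. The class and method names are hypothetical and this is not how the test suite is implemented. The allowed policy and latency values are those defined by the SRM v2.2 specification; the rule that the guaranteed size should not exceed the desired size, and the use of bytes and seconds as units, are assumptions made for illustration.

```python
from dataclasses import dataclass

RETENTION_POLICIES = {"REPLICA", "OUTPUT", "CUSTODIAL"}  # SRM v2.2 values
ACCESS_LATENCIES = {"ONLINE", "NEARLINE"}  # SRM v2.2 values


@dataclass
class SpaceRequest:
    """Hypothetical container for the options exercised by the test suite."""
    retention_policy: str
    access_latency: str
    guaranteed_size: int  # assumed to be in bytes
    desired_size: int     # assumed to be in bytes
    lifetime: int         # assumed to be in seconds

    def problems(self):
        """Return a list of human-readable problems; empty if none are found."""
        issues = []
        if self.retention_policy not in RETENTION_POLICIES:
            issues.append("unknown retention_policy")
        if self.access_latency not in ACCESS_LATENCIES:
            issues.append("unknown access_latency")
        if self.guaranteed_size <= 0:
            issues.append("guaranteed_size must be positive")
        if self.desired_size < self.guaranteed_size:
            issues.append("desired_size is smaller than guaranteed_size")
        return issues


if __name__ == "__main__":
    request = SpaceRequest("REPLICA", "ONLINE", 10**9, 2 * 10**9, 3600)
    print(request.problems())
```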

  • Replica Manager test suite. It performs the following tasks:
    • copy multiple files into storage
    • verify report of disk usage
    • check the number of replicas for each file
    • delete these files from storage
    • verify consistency of information in Pnfs, pools and Replica Manager
    • verify report of disk usage

Warning: If your dCache configuration doesn't support the replica manager, you should skip this test. See the above section on how to set up the replica manager.

The validation suite is an rpm package that can be downloaded from VDT dcache tools. You will have to modify the test configuration to match your dCache installation. A detailed description of the configuration is provided in the README file contained in the rpm.

Installing the Validation Test Suite

After installing the rpm, change the ownership of the installation root directory to the user who will run the tests with a VOMS proxy certificate.

wget http://vdt.cs.wisc.edu/software/dcache/tools//testing/dcache_validation-0.1-0.noarch.rpm
rpm -i --prefix <your_home_area> dcache_validation-0.1-0.noarch.rpm
chown -R <your_user_name>:<your_group_name> <your_home_area>/dcache_validation-0.1

Running the Validation Test Suite

Detailed instructions on how to run the tests, what software must be installed on your machine and how to view the results are provided in dcache_validation-0.1/README. To view the results of the Validation Test Suite on the web, follow the instructions in the README of the suite's installation. An example of the test suite results can be found here.

Warning: To see this example, you must have your user certificate installed in your browser.

Registration for SRM Monitoring at LBNL

To register your SE with the LBNL monitoring system, go to http://datagrid.lbl.gov/sitereg/ and follow the instructions there.

Daily functional monitoring for all SRM interfaces is done around 9am Pacific time.

Testing Opportunistic Storage

The Fermi client commands for using space reservations are described in Using Opportunistic Storage. These commands can be run from any client machine on which they are installed (see Installation of Clients below). When a Compute Element is associated with a Storage Element, jobs containing the commands described therein should be created and run on worker nodes. To test opportunistic storage from a Compute Element, run the oppstor_test.py script (remove the .txt extension from the attachment) from a worker node. This script runs several tests that make, use, and release space reservations using the Fermi clients. To run it, you only need a proxy and the Fermi clients in your path. Use the command

jython oppstor_test.py  srm://srmnode.oursite.edu:8443 /pnfs/oursite.edu/data/oppstorage/test

with the URL of the srm server and desired path for the written files.
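Before submitting the test from a worker node, it can help to check the two arguments. The following Python sketch builds the command line shown above after a basic check of the SRM URL and the path. The helper name `oppstor_command` is hypothetical, the checks are only illustrative, and the host and path in the demo are the placeholders used in the example command.

```python
from urllib.parse import urlsplit


def oppstor_command(srm_url, pnfs_path, script="oppstor_test.py"):
    """Return the argument list for the test run, or raise ValueError."""
    parts = urlsplit(srm_url)
    if parts.scheme != "srm" or not parts.hostname:
        raise ValueError("expected an srm://host[:port] URL, got %r" % srm_url)
    if not pnfs_path.startswith("/"):
        raise ValueError("expected an absolute path, got %r" % pnfs_path)
    return ["jython", script, srm_url, pnfs_path]


if __name__ == "__main__":
    command = oppstor_command(
        "srm://srmnode.oursite.edu:8443",
        "/pnfs/oursite.edu/data/oppstorage/test",
    )
    # Printed here; the list could instead be passed to subprocess.run().
    print(" ".join(command))
```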

Removal of files used in opportunistic storage that are no longer needed allows the space to be recycled. To do this please use the SRM PNFS Space Reclaimer of the OSG Storage Operations Toolkit (see below).

Installation of Clients

Fermi SRM Clients

This set of srmclient commands, developed and maintained at Fermilab, can be used to access and validate a Storage Element. In an ITB CE install (based on VDT 1.10.0) the package is located in $VDT_LOCATION/srm-v1-client. Once you source $VDT_LOCATION/setup.sh (or setup.csh), all of these commands should be available in your PATH.
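One quick way to confirm that the clients are on your PATH after sourcing the setup file is a check like the Python sketch below. The helper name `missing_commands` is hypothetical; the command names are a subset of those listed for the Fermi SRM Client Test Suite earlier on this page.

```python
import shutil

# A subset of the Fermi client commands named on this page.
FERMI_COMMANDS = ["srmcp", "srmmkdir", "srmmv", "srmls", "srmrm", "srmrmdir"]


def missing_commands(names=FERMI_COMMANDS):
    """Return the commands from names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed Fermi SRM client commands were found.")
```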

Note - Even though the name suggests that the Fermi srmclients only support SRM protocol version 1.1, this is not true. The Fermi srmclients in fact support both SRM protocol versions 1.1 and 2.2. The name will be changed in VDT 1.10.1.

Note - If you want the latest Fermi srmclient package and cannot wait for a new VDT release, you can download the rpm from the main dCache website. After you install the rpm, the commands are by default available in the /opt/d-cache/srm/bin directory.

LBNL SRM Clients

A set of SRM client commands, developed at LBNL as generic SRM v2.2 clients, is available to access any SRM v2.2 based storage component. The clients have been tested against all current SRM v2.2 implementations, such as BeStMan, CASTOR, dCache, DPM, SRM/SRB and StoRM, and are continuously tested for compatibility and interoperability. They can be installed from the VDT distribution or from the tar file download from the SDM group at LBNL. Sample command lines for BeStMan at NERSC and for dCache at FNAL are available on this link.

LCG Utils

LCG Utils is a suite of client tools for data movement written for the LHC Computing Grid. The tools are based on the Grid File Access Library, which is also included. Commands which access SRM servers conform to the SRM v2.2 specification. Some commands require a connection to a BDII-based catalog, but file copies and deletions based on the SRM URL alone are possible. Examples are given below.

LCG-Utils has been included in the VDT since version 1.10.0. It can be installed with the pacman command

pacman -get http://vdt.cs.wisc.edu/vdt_1101_cache:LCG-Utils

Choose the "install locally" option for the CA certificates to avoid conflict with existing GSI infrastructure.

After running ". setup.sh", basic commands can be executed as follows. For a copy command, after creating a user proxy, run

lcg-cp -v -b -D srmv2 file:/testdata/test1 <SRM URL>

Here the SRM URL is to an SRM v2.2 web service endpoint. Copies between storage elements are also allowed.

To delete a file, use the VO from the proxy and execute

lcg-del -b -v -l --vo <VO> -D srmv2 <SRM URL>
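For scripted tests, the copy and delete commands above can be assembled as argument lists, as in this minimal Python sketch. It uses only the flags shown in the two commands above; the function names are hypothetical, and the VO name and SRM URL in the demo are placeholders to be replaced with your own values.

```python
def lcg_cp_command(local_path, srm_url):
    """Argument list for copying a local file to an SRM v2.2 endpoint."""
    return ["lcg-cp", "-v", "-b", "-D", "srmv2", "file:" + local_path, srm_url]


def lcg_del_command(vo, srm_url):
    """Argument list for deleting a file, using the VO from the proxy."""
    return ["lcg-del", "-b", "-v", "-l", "--vo", vo, "-D", "srmv2", srm_url]


if __name__ == "__main__":
    srm_url = "<SRM URL>"  # placeholder, as in the commands above
    print(" ".join(lcg_cp_command("/testdata/test1", srm_url)))
    print(" ".join(lcg_del_command("<VO>", srm_url)))
```

Passing lists like these to subprocess.run() avoids shell quoting problems when the SRM URL contains characters such as "?" or "=".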

An lcg-utils test script (test_lcg_utils_vdt.sh, attached to this page) may be used. The script writes to an SRM v2.2 storage element, obtains TURLs for the written files, and deletes them. Change the definitions at the beginning of the script to match your site and VO.

Documentation for LCG-Utils may be found in the LCG User Guide. Note that the requirement of setting LCG_GFAL_INFOSYS applies only when using a BDII server.

More information

OSG Storage Operations Toolkit

Downloads and details are available at http://datagrid.ucsd.edu/toolkit/ . This is a toolkit contributed by the site community in the Open Science Grid, packaged as a bundle of RPMs, each containing specific utilities. Its goal is to make operating the storage deployed on OSG, and on peer EU grids, more effective and efficient. The toolkit has been widely adopted and is in active use at almost all large production-scale sites in the US as well as many EU sites in WLCG.

Complete: 2
Responsible: TedHesselroth - 29 Apr 2008
Reviewer - date: RobGardner - 22 May 2008
Comment: Works also for production OSG, no?

Topic attachments
  • oppstor_test.py.txt (16.1 K, 06 May 2008 - 19:50, UnknownUser): A Python script to test space reservation from the client side.
  • test_lcg_utils_vdt.sh (0.8 K, 01 May 2008 - 19:53, UnknownUser): Test script for LCG-Utils.
Topic revision: r32 - 06 Oct 2010 - 21:40:53 - TanyaLevshina