Review Passed by SuchandraThapa
Released by ScotKronenfeld

Install and Configure the Resource and Service Validation

About this Document

This document is for system administrators. It describes how to use the rsv-control command to enable, disable, and run RSV probes.

Using rsv-control

rsv-control is a script introduced in RSV 3.3.0 that provides an interface to many RSV tasks. It can list RSV jobs, run metrics, enable or disable metrics and consumers, and perform advanced configuration.

Note About Configuring Using rsv-control

rsv-control can be used to configure RSV, but most site admins can configure RSV with the following steps:
  • Edit the [RSV] section in config.ini
  • Run configure-osg -c
  • Done! Once you enable services (via vdt-control) you've got a working RSV installation.

Configuring with rsv-control is intended for advanced RSV use, such as enabling non-default metrics. Admins who don't use rsv-control for configuration can still use it to view their RSV jobs, run RSV tests, and help debug RSV problems.

Viewing RSV jobs

rsv-control provides two views of RSV jobs: the desired state and the actual (current) state.

  • Desired = what metrics and consumers will start the next time RSV is started
  • Actual = what metrics and consumers are currently running

Desired state

To view the desired state, use the --list (-l for short) flag. This prints one table for each host, showing the metrics that are enabled to run against that host.

$> rsv-control --list

Metrics enabled for host: osgitb1.nhn.ou.edu              | Service
----------------------------------------------------------+--------------------
org.osg.batch.jobmanager-default-status                   | OSG-CE
org.osg.batch.jobmanagers-available                       | OSG-CE
org.osg.certificates.cacert-expiry                        | OSG-CE
org.osg.certificates.crl-expiry                           | OSG-CE
org.osg.general.osg-directories-CE-permissions            | OSG-CE
org.osg.general.osg-version                               | OSG-CE
org.osg.general.ping-host                                 | OSG-CE
org.osg.general.vdt-version                               | OSG-CE
org.osg.general.vo-supported                              | OSG-CE
org.osg.globus.gram-authentication                        | OSG-CE
org.osg.globus.gridftp-simple                             | OSG-GridFTP
org.osg.gratia.condor                                     | OSG-CE
org.osg.gratia.metric                                     | OSG-CE


Metrics enabled for host: osg-edu.cs.wisc.edu:10443       | Service
----------------------------------------------------------+--------------------
org.osg.srm.srmcp-readwrite                               | OSG-SRM
org.osg.srm.srmping                                       | OSG-SRM

Other options:

  • To view all installed metrics use the --all (-a) flag along with --list. This will print an extra table showing metrics that are disabled on all hosts.
  • If you are having problems with the output being truncated, try the --wide (-w) flag.
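
If you want to post-process the desired state in a script, the tables can be split on the "|" column separator. The following is a minimal Python sketch, not part of RSV. It assumes the exact table layout shown in the sample above; the function name parse_list_output is made up for illustration, and the extra table printed by --all is not handled.

```python
def parse_list_output(text):
    """Return {host: [(metric, service), ...]} from `rsv-control --list` text.

    Assumes the table layout shown in this document; not an official parser.
    """
    tables = {}
    host = None
    prefix = "Metrics enabled for host:"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            # Header row: the host sits between the prefix and the '|' separator.
            host = line[len(prefix):].split("|")[0].strip()
            tables[host] = []
        elif host and "|" in line:
            # Data row: "metric | service". The dashed rule has no '|' and is skipped.
            metric, service = (part.strip() for part in line.split("|", 1))
            tables[host].append((metric, service))
    return tables


# Sample text copied (and shortened) from the --list output above.
sample = """\
Metrics enabled for host: osgitb1.nhn.ou.edu              | Service
----------------------------------------------------------+--------------------
org.osg.general.ping-host                                 | OSG-CE
org.osg.globus.gridftp-simple                             | OSG-GridFTP


Metrics enabled for host: osg-edu.cs.wisc.edu:10443       | Service
----------------------------------------------------------+--------------------
org.osg.srm.srmping                                       | OSG-SRM
"""

tables = parse_list_output(sample)
for host, rows in tables.items():
    print(host, [metric for metric, _ in rows])
```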

Actual state

To view the current, running state of RSV jobs, use the --job-list flag (-j for short). This will show all metrics and consumers running in RSV. (It queries the underlying Condor Cron system that we use to run the metrics.)

$> rsv-control --job-list

Hostname: osg-edu.cs.wisc.edu
     ID OWNER      ST NEXT RUN TIME   METRIC
  154.0 rsvuser    I  11-19 12:15     org.osg.certificates.cacert-expiry
  155.0 rsvuser    R  11-19 11:23     org.osg.gratia.metric
  156.0 rsvuser    I  11-19 18:47     org.osg.general.vdt-version
  157.0 rsvuser    I  11-19 12:30     org.osg.certificates.crl-expiry
  158.0 rsvuser    I  11-19 11:31     org.osg.globus.gram-authentication
  159.0 rsvuser    I  11-19 11:41     org.osg.general.osg-version
  160.0 rsvuser    R  11-19 11:25     org.osg.batch.jobmanager-default-status
  161.0 rsvuser    I  11-20 04:59     org.osg.batch.jobmanagers-available
  162.0 rsvuser    I  11-19 11:37     org.osg.general.osg-directories-CE-permissions
  163.0 rsvuser    I  11-19 12:08     org.osg.globus.gridftp-simple
  164.0 rsvuser    I  11-19 12:09     org.osg.gratia.condor
  165.0 rsvuser    R  11-19 11:27     org.osg.general.ping-host
  166.0 rsvuser    I  11-19 18:47     org.osg.general.vo-supported

Hostname: osg-edu.cs.wisc.edu:10443
     ID OWNER      ST NEXT RUN TIME   METRIC
  113.0 rsvuser    I  11-19 11:33     org.osg.srm.srmping
  114.0 rsvuser    R  11-19 11:28     org.osg.srm.srmcp-readwrite

     ID OWNER      ST CONSUMER
  198.0 rsvuser    R  html-consumer
  199.0 rsvuser    R  gratia-consumer

The ST field indicates the current job status:

  • R = the metric is currently running
  • I = the metric is idle and will be run at the next scheduled interval
  • Any other letter may indicate a problem
  • Consumers always show R (running), even though they only run once every five minutes.
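
The ST rule above (anything other than R or I may indicate a problem) can be checked automatically. This is a hedged Python sketch that assumes the column layout shown in the --job-list sample; find_problem_jobs is a hypothetical helper, and the job with status H in the sample text is invented purely to show a non-R/I status.

```python
def find_problem_jobs(text):
    """Return (host, job_id, status, name) for jobs whose ST is not R or I.

    Assumes the `rsv-control --job-list` layout shown in this document.
    """
    problems = []
    host = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Hostname:":
            host = fields[1]
        elif fields[0] == "ID":
            # Column header. The consumer table is not tied to a host.
            if "CONSUMER" in fields:
                host = None
        elif len(fields) >= 4:
            # Job row: ID, OWNER, ST, ..., and the metric or consumer name last.
            job_id, status, name = fields[0], fields[2], fields[-1]
            if status not in ("R", "I"):
                problems.append((host, job_id, status, name))
    return problems


# Shortened from the sample above. Job 156.0 was changed to status H
# (an invented value for illustration only).
sample = """\
Hostname: osg-edu.cs.wisc.edu
     ID OWNER      ST NEXT RUN TIME   METRIC
  154.0 rsvuser    I  11-19 12:15     org.osg.certificates.cacert-expiry
  155.0 rsvuser    R  11-19 11:23     org.osg.gratia.metric
  156.0 rsvuser    H  11-19 18:47     org.osg.general.vdt-version

     ID OWNER      ST CONSUMER
  198.0 rsvuser    R  html-consumer
"""

problems = find_problem_jobs(sample)
for host, job_id, status, name in problems:
    print(f"{host}: job {job_id} ({name}) has status {status}")
```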

Running a metric

rsv-control can be used to run metrics one time against a host. This can be useful for:

  • updating the status of a metric that had a problem instead of waiting until the next scheduled run time
  • testing a metric against a host before deciding whether to enable it

Note that the record for each run will be published to all active consumers. That is, it will be published to Gratia or will show up on your local web page, if you have those enabled.

Simplest test:

Use the --run (-r) flag. You must also provide the --host flag (--host can be abbreviated to -u, which is short for URI, the old and inaccurate RSV term for host [+ port]). The syntax is:

rsv-control --run --host HOST METRIC [ METRIC2 ...]

where METRIC is the full metric name (e.g. org.osg.general.osg-version). You can get the metric names from the --list output.

For example:

$> rsv-control --run --host osg-edu.cs.wisc.edu org.osg.general.osg-version

Running metric org.osg.general.osg-version:

metricName: org.osg.general.osg-version
metricType: status
timestamp: 2010-11-19 11:40:19 CST
metricStatus: OK
serviceType: OSG-CE
serviceURI: osg-edu.cs.wisc.edu
gatheredAt: vdt-itb.cs.wisc.edu
summaryData: OK
detailsData: OSG 1.2.15
EOT

The metricStatus field shows whether the metric succeeded. Here it is OK, so the run was successful.
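
If you want to check the result from a script rather than by eye, the record can be read as "key: value" lines terminated by EOT. The sketch below is an illustration in Python, assuming the record layout shown above; parse_record is a made-up name, and treating every line after detailsData as part of the details is an assumption based on the multi-line samples in this document.

```python
def parse_record(text):
    """Parse one metric record (as printed by --run) into a dict.

    Assumes "key: value" lines ending with a line that reads EOT, and that
    detailsData is the last field and may span several lines.
    """
    record = {}
    in_details = False
    for line in text.splitlines():
        if line.strip() == "EOT":
            break
        if in_details:
            # Continuation of a multi-line detailsData value.
            record["detailsData"] += "\n" + line
            continue
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or not key or " " in key:
            continue  # blank line or the "Running metric ..." banner
        record[key] = value.strip()
        in_details = key == "detailsData"
    return record


# Sample text copied from the output above.
sample = """\
Running metric org.osg.general.osg-version:

metricName: org.osg.general.osg-version
metricType: status
timestamp: 2010-11-19 11:40:19 CST
metricStatus: OK
serviceType: OSG-CE
serviceURI: osg-edu.cs.wisc.edu
gatheredAt: vdt-itb.cs.wisc.edu
summaryData: OK
detailsData: OSG 1.2.15
EOT
"""

record = parse_record(sample)
print(record["metricName"], "->", record["metricStatus"])
```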

Running multiple metrics

To run multiple metrics against a single host, list all of them on the rsv-control command line. For example:

$> rsv-control -r -u osg-edu.cs.wisc.edu org.osg.general.vo-supported org.osg.globus.gram-authentication org.osg.globus.gridftp-simple

Running metric org.osg.general.vo-supported (1 of 3)

metricName: org.osg.general.vo-supported
metricType: status
timestamp: 2010-11-19 13:40:40 CST
metricStatus: OK
serviceType: OSG-CE
serviceURI: osg-edu.cs.wisc.edu
gatheredAt: vdt-itb.cs.wisc.edu
summaryData: OK
detailsData: # List of VOs this site claims to support mis osgedu
EOT


Running metric org.osg.globus.gram-authentication (2 of 3)

metricName: org.osg.globus.gram-authentication
metricType: status
timestamp: 2010-11-19 13:40:55 CST
metricStatus: OK
serviceType: OSG-CE
serviceURI: osg-edu.cs.wisc.edu
gatheredAt: vdt-itb.cs.wisc.edu
summaryData: OK
detailsData: GRAM Authentication test successful
EOT


Running metric org.osg.globus.gridftp-simple (3 of 3)

metricName: org.osg.globus.gridftp-simple
metricType: status
timestamp: 2010-11-19 13:40:56 CST
metricStatus: OK
serviceType: OSG-GridFtp
serviceURI: osg-edu.cs.wisc.edu
gatheredAt: vdt-itb.cs.wisc.edu
summaryData: OK
detailsData: Gridftp was succesfully tested! Upload to and download from remote host succeeded; Received file is valid.
EOT

In order to run metrics against multiple hosts you must run rsv-control multiple times, once for each host.
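
Since each invocation handles a single host, a small wrapper can generate one command per host. This Python sketch only builds and prints the command lines, using the --run and --host flags described above; the helper name build_run_commands and the host list are illustrative, and actually executing the commands (for example with subprocess.run) is left to the reader.

```python
import shlex


def build_run_commands(hosts, metrics):
    """Return one `rsv-control --run` argument list per host."""
    return [["rsv-control", "--run", "--host", host, *metrics] for host in hosts]


# Example hosts and metrics taken from elsewhere in this document.
hosts = ["osg-edu.cs.wisc.edu", "vdt-itb.cs.wisc.edu"]
metrics = ["org.osg.general.osg-version", "org.osg.general.ping-host"]

commands = build_run_commands(hosts, metrics)
for argv in commands:
    # Print a shell-safe version of each command. On a machine with RSV
    # installed, subprocess.run(argv) could be used to execute it instead.
    print(" ".join(shlex.quote(arg) for arg in argv))
```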

Running all enabled metrics

When RSV is first installed it can take up to a day for each enabled metric to run once. A new option is provided to force each metric to run immediately, for all hosts. Use the --all-enabled flag along with --run. With this option it is not necessary to specify a host - all enabled metrics for all configured hosts will be run (in fact, if you do specify a host it will be ignored). For example:

$> rsv-control -r --all-enabled

Running metric org.osg.certificates.cacert-expiry (1 of 15)

metricName: org.osg.certificates.cacert-expiry
metricType: status
timestamp: 2010-11-19 13:44:08 CST
metricStatus: OK
serviceType: OSG-CE
serviceURI: osg-edu.cs.wisc.edu
gatheredAt: vdt-itb.cs.wisc.edu
summaryData: OK
detailsData: Security Probe Version: 1.1
OK: CAs are in sync with OSG distribution
EOT


Running metric org.osg.gratia.metric (2 of 15)

.
.
.
Output trimmed

Passing extra configuration


If you want to pass extra configuration when running a metric without editing its configuration file, you can create an INI-formatted file and pass it on the command line. For example, you can create a file like this for the org.osg.srm.srmclient-ping metric:

$> cat tmp-srm.ini
[org.osg.srm.srmclient-ping args]
srm-destination-dir=/srmcache/~
srm-webservice-path=srm/v2/server

Then use the --extra-config-file parameter and pass the path to the INI file.

$> rsv-control -r --extra-config-file tmp-srm.ini -u osg-edu.cs.wisc.edu:10443 org.osg.srm.srmclient-ping

Running metric org.osg.srm.srmclient-ping:

metricName: org.osg.srm.srmclient-ping
metricType: status
timestamp: 2010-11-19 14:12:35 CST
metricStatus: OK
serviceType: OSG-SRM
serviceURI: osg-edu.cs.wisc.edu:10443
gatheredAt: vdt-itb.cs.wisc.edu
summaryData: OK
detailsData: SRM server running on osg-edu.cs.wisc.edu is alive and responding to the srmping command.
.  Details: Storage Resource Manager (SRM) Client version 2.1.5-16
Copyright (c) 2002-2009 Fermi National Accelerator Laboratory

Output trimmed
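
An extra-config file like tmp-srm.ini can also be generated from a script, which may help when the same arguments are needed for several test runs. The following Python sketch uses the standard configparser module and assumes the "[METRIC args]" section naming shown in the example; render_extra_config is a hypothetical helper, not an RSV tool.

```python
import configparser
import io


def render_extra_config(metric, args):
    """Render an INI snippet with a "[METRIC args]" section.

    The section naming follows the tmp-srm.ini example in this document.
    """
    parser = configparser.RawConfigParser()
    parser.optionxform = str  # keep option names exactly as given
    parser[metric + " args"] = args
    buffer = io.StringIO()
    # Write "key=value" without spaces around "=", as in the example file.
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


ini_text = render_extra_config(
    "org.osg.srm.srmclient-ping",
    {
        "srm-destination-dir": "/srmcache/~",
        "srm-webservice-path": "srm/v2/server",
    },
)
print(ini_text, end="")
```

The printed text can be saved to a file and passed with --extra-config-file as shown above.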

Enabling and disabling metrics and consumers

Metrics and consumers can be enabled or disabled by rsv-control using the --enable and --disable flags. Note that "enable" and "disable" are desired states (this is similar to vdt-control). After enabling a metric you should turn it on if you want it to be running immediately. After disabling a metric that is running, you should still turn it off (a message will print after each of these actions to remind you of this behavior).

Enabling

The syntax for enabling metrics looks similar to the syntax for running metrics:

rsv-control --enable --host HOST METRIC [ METRIC2 ...]

You must provide a host to enable the metric against (in order to enable a metric on multiple hosts you must run rsv-control once per host).

Consumers do not run against a specific host; they process records for all hosts. When enabling consumers, a host is not required (if a host is passed, it will be ignored).

Example of enabling a metric:

$> rsv-control --enable --host osg-edu.cs.wisc.edu org.osg.gip.consistency 
Enabling metric 'org.osg.gip.consistency' for host 'osg-edu.cs.wisc.edu'

One or more metrics have been enabled and will be started the next time RSV is started.  To turn them on immediately run 'rsv-control --on'.

Example of enabling a consumer:

$> rsv-control --enable nagios-consumer
Enabling consumer nagios-consumer

Example of enabling multiple metrics:

$>  rsv-control --enable --host vdt-itb.cs.wisc.edu org.osg.local.hostcert-expiry org.osg.local.httpcert-expiry org.osg.local.containercert-expiry
Enabling metric 'org.osg.local.hostcert-expiry' for host 'vdt-itb.cs.wisc.edu'
Enabling metric 'org.osg.local.httpcert-expiry' for host 'vdt-itb.cs.wisc.edu'
   Metric already enabled
Enabling metric 'org.osg.local.containercert-expiry' for host 'vdt-itb.cs.wisc.edu'

One or more metrics have been enabled and will be started the next time RSV is started.  To turn them on immediately run 'rsv-control --on'.

Disabling

The syntax for disabling metrics looks similar to the syntax for running metrics:

rsv-control --disable --host HOST METRIC [ METRIC2 ...]

You must provide a host to disable the metric against (in order to disable a metric on multiple hosts you must run rsv-control once per host).

Consumers do not run against a specific host; they process records for all hosts. When disabling consumers, a host is not required (if a host is passed, it will be ignored).

Example of disabling a metric:

$> rsv-control --disable -u vdt-itb.cs.wisc.edu org.osg.local.containercert-expiry
Disabling metric 'org.osg.local.containercert-expiry' for host 'vdt-itb.cs.wisc.edu'

One or more metrics have been disabled and will not start the next time RSV is started.  You may still need to turn them off if they are currently running.

Example of disabling multiple consumers:

$> rsv-control --disable html-consumer gratia-consumer
Disabling consumer html-consumer
Disabling consumer gratia-consumer
   Consumer already disabled

Metrics and consumers can both be listed in the same disable command.
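
Because --enable and --disable only change the desired state, the desired and actual states can drift apart until RSV is turned on or off. The Python sketch below illustrates one way to compare them. It assumes you have already collected the metric names per host from --list (desired) and --job-list (actual); the dictionaries here are hand-written examples and compare_states is a made-up helper.

```python
def compare_states(desired, actual):
    """Compare desired and actual metrics per host.

    Returns (to_start, to_stop): metrics enabled but not running, and
    metrics running but no longer enabled, each as {host: [metric, ...]}.
    """
    to_start = {}
    to_stop = {}
    for host in set(desired) | set(actual):
        want = set(desired.get(host, ()))
        have = set(actual.get(host, ()))
        if want - have:
            to_start[host] = sorted(want - have)
        if have - want:
            to_stop[host] = sorted(have - want)
    return to_start, to_stop


# Hand-written example: one metric was just enabled on the first host,
# and one was just disabled on the second host but is still running.
desired = {
    "osg-edu.cs.wisc.edu": ["org.osg.general.ping-host", "org.osg.gip.consistency"],
    "vdt-itb.cs.wisc.edu": ["org.osg.local.hostcert-expiry"],
}
actual = {
    "osg-edu.cs.wisc.edu": ["org.osg.general.ping-host"],
    "vdt-itb.cs.wisc.edu": [
        "org.osg.local.hostcert-expiry",
        "org.osg.local.containercert-expiry",
    ],
}

to_start, to_stop = compare_states(desired, actual)
print("Enabled but not running:", to_start)
print("Running but disabled:", to_stop)
```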

Troubleshooting

Getting more information from rsv-control

The first step to getting more information is to run rsv-control with more verbosity. Use the --verbose (-v) flag. This flag can be used with any of rsv-control's abilities (run, enable, list, etc.). The verbosity levels are:

  • 0 = print nothing
  • 1 = print warnings and errors along with usual output of command being run (1 is the default level)
  • 2 = adds informational messages
  • 3 = full debugging output

For example, here is the output when running a metric with -v2:

$> rsv-control -r -u osg-edu.cs.wisc.edu org.osg.general.osg-version -v2

INFO: Reading configuration file /home/kronenfe/new-rsv/osg-rsv/etc/rsv.conf
INFO: Reading configuration file /home/kronenfe/new-rsv/osg-rsv/etc/consumers.conf
INFO: Validating configuration:
INFO: Validating user:
INFO:     Invoked as root.  Switching to 'rsvuser' user (uid: 501 - gid: 501)
INFO: Registered consumers: html-consumer
INFO: Loading config file '/home/kronenfe/new-rsv/osg-rsv/meta/metrics/org.osg.general.osg-version.meta'
INFO: Loading config file '/home/kronenfe/new-rsv/osg-rsv/etc/metrics/org.osg.general.osg-version.conf'
INFO: Config file '/home/kronenfe/new-rsv/osg-rsv/etc/metrics/osg-edu.cs.wisc.edu/org.osg.general.osg-version.conf' does not exist
INFO: Checking proxy:
INFO:     Using service certificate proxy
INFO: Running command with timeout (300 seconds):
        /usr/bin/openssl x509 -in /tmp/rsvproxy -noout -enddate -checkend 21600
INFO: Exit code of job: 0
INFO:     Service certificate valid for at least 6 hours.
INFO: Pinging host osg-edu.cs.wisc.edu:
INFO: Running command with timeout (300 seconds):
        /bin/ping -W 3 -c 1 osg-edu.cs.wisc.edu
INFO: Exit code of job: 0
INFO:     Ping successful

Running metric org.osg.general.osg-version:

INFO: Forming arguments:
INFO:     Adding -x because probe version is v3
INFO:     Adding --verbose because probe version is v3
INFO:     Arguments: '-x /tmp/rsvproxy --verbose '
INFO: Setting up job environment:
INFO: Running command with timeout (300 seconds):
        /home/kronenfe/new-rsv/python/python-setup.py
INFO: Exit code of job: 0
INFO: VDT PYTHONPATH = /home/kronenfe/new-rsv/python/lib/python2.4/site-packages

INFO: Running command with timeout (300 seconds):
        /home/kronenfe/new-rsv/perl/perl-setup.pl
INFO: Exit code of job: 0
INFO: VDT PERL5LIB = /home/kronenfe/new-rsv/perl/lib/5.8.0:/home/kronenfe/new-rsv/perl/lib/5.8.0/x86_64-linux-thread-multi:/home/kronenfe/new-rsv/perl/lib/site_perl/5.8.0:/home/kronenfe/new-rsv/perl/lib/site_perl/5.8.0/x86_64-linux-thread-multi:/home/kronenfe/new-rsv/vdt/lib/perl:/home/kronenfe/new-rsv/perl/lib/5.8.0:/home/kronenfe/new-rsv/perl/lib/5.8.0/x86_64-linux-thread-multi:/home/kronenfe/new-rsv/perl/lib/site_perl/5.8.0:/home/kronenfe/new-rsv/perl/lib/site_perl/5.8.0/x86_64-linux-thread-multi:/home/kronenfe/new-rsv/vdt/lib:
INFO:     Var: 'X509_CERT_DIR' Action: 'SET' Value: '/home/kronenfe/new-rsv/globus/TRUSTED_CA'
INFO:     Var: 'PERL5LIB' Action: 'PREPEND' Value: '/home/kronenfe/new-rsv/osg-rsv/bin/probes:/home/kronenfe/new-rsv/perl/lib/5.8.0:/home/kronenfe/new-rsv/perl/lib/5.8.0/x86_64-linux-thread-multi:/home/kronenfe/new-rsv/perl/lib/site_perl/5.8.0:/home/kronenfe/new-rsv/perl/lib/site_perl/5.8.0/x86_64-linux-thread-multi:/home/kronenfe/new-rsv/vdt/lib/perl:/home/kronenfe/new-rsv/perl/lib/5.8.0:/home/kronenfe/new-rsv/perl/lib/5.8.0/x86_64-linux-thread-multi:/home/kronenfe/new-rsv/perl/lib/site_perl/5.8.0:/home/kronenfe/new-rsv/perl/lib/site_perl/5.8.0/x86_64-linux-thread-multi:/home/kronenfe/new-rsv/vdt/lib:'
INFO:     Var: 'LD_LIBRARY_PATH' Action: 'APPEND' Value: '/home/kronenfe/new-rsv/globus/lib'
INFO:     Var: 'GLOBUS_LOCATION' Action: 'SET' Value: '/home/kronenfe/new-rsv/globus'
INFO: Running command with timeout (300 seconds):
        /home/kronenfe/new-rsv/osg-rsv/bin/metrics/org.osg.general.osg-version -m org.osg.general.osg-version -u osg-edu.cs.wisc.edu -x /tmp/rsvproxy --verbose
INFO: Exit code of job: 0
INFO: Creating record for html-consumer consumer at '/home/kronenfe/new-rsv/osg-rsv/output/html-consumer/org.osg.general.osg-version.hrN3-v'
INFO: Result:

metricName: org.osg.general.osg-version
metricType: status
timestamp: 2010-11-19 14:16:39 CST
metricStatus: OK
serviceType: OSG-CE
serviceURI: osg-edu.cs.wisc.edu
gatheredAt: vdt-itb.cs.wisc.edu
summaryData: OK
detailsData: OSG 1.2.15
EOT
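
When the verbose output is long, it can help to pair each "Running command" line with the "Exit code of job" line that follows it, so that a failing step stands out. This Python sketch assumes the INFO message format shown in the -v2 output above; command_exit_codes is a hypothetical helper, not part of rsv-control.

```python
def command_exit_codes(text):
    """Return [(command, exit_code), ...] from rsv-control -v2 output.

    Assumes each "INFO: Running command with timeout" line is followed by the
    command on the next line and later by "INFO: Exit code of job: N".
    """
    results = []
    pending = None
    lines = iter(text.splitlines())
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("INFO: Running command with timeout"):
            # The command itself is printed on the following, indented line.
            pending = next(lines, "").strip()
        elif stripped.startswith("INFO: Exit code of job:") and pending is not None:
            code = int(stripped.rsplit(":", 1)[1])
            results.append((pending, code))
            pending = None
    return results


# Sample text copied from the -v2 output above.
sample = """\
INFO: Checking proxy:
INFO:     Using service certificate proxy
INFO: Running command with timeout (300 seconds):
        /usr/bin/openssl x509 -in /tmp/rsvproxy -noout -enddate -checkend 21600
INFO: Exit code of job: 0
INFO:     Service certificate valid for at least 6 hours.
INFO: Pinging host osg-edu.cs.wisc.edu:
INFO: Running command with timeout (300 seconds):
        /bin/ping -W 3 -c 1 osg-edu.cs.wisc.edu
INFO: Exit code of job: 0
INFO:     Ping successful
"""

results = command_exit_codes(sample)
failed = [command for command, code in results if code != 0]
print(f"{len(results)} commands run, {len(failed)} failed")
```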

Using the RSV verify tool

The --verify flag will run some basic checks for your RSV installation. For example:

$> rsv-control --verify
Testing if Condor-Cron is running...
OK

Testing if metrics are running...
OK (98 running metrics)

Testing if consumers are running...
OK (1 running consumers)

Checking which consumers are configured...
The following consumers are enabled: html-consumer
WARNING: The gratia-consumer is not enabled.  This indicates that your
         resource is not reporting to OSG.

This tool is still under development and performs only basic checks, but it is a good first step when debugging issues.

Running the rsv-profiler

If you need more help, you can contact goc@opensciencegrid.org. For most problems the RSV support team will ask you to generate a profile tarball to share information (including log and configuration files). You can save time by generating it up front and including it with your original request:

$> rsv-control --profile
Running the rsv-profiler...
OSG-RSV Profiler
Analyzing...
Making tarball (rsv-profiler.tar.gz)

Topic revision: r13 - 22 Feb 2012 - 16:42:57 - KyleGross