Installing, Configuring, Using, and Troubleshooting RSV
The Resource and Service Validation (RSV) software helps a site administrator verify that certain site resources and services are working as expected. OSG recommends that sites install and run RSV, but it is optional; further, each site selects which specific tests (called probes) to run.
Use this page to learn more about RSV in general, and how to install, configure, run, test, and troubleshoot RSV from the OSG software repositories. For documentation on specific probes or on how to write your own probes, please check the Reference section.
The Resource and Service Validation (RSV) software provides OSG site administrators a scalable and easy-to-maintain resource and service monitoring infrastructure. The components of RSV are:
- RSV Client. The client tools allow a site administrator to run tests against their site by providing a set of tests (which can run on the same or other hosts within a site), HTCondor-Cron for scheduling, and tools for collecting and storing the results (using Gratia). The client package is not installed by default and may be installed on a CE or other host. Generally, you configure the RSV client to run tests at scheduled time intervals and then it makes results available on a local website. Also, the client can upload test results to a central collector (see next item).
- RSV Collector/Server. The central OSG RSV Collector accepts and stores results from RSV clients throughout OSG, which can be viewed in MyOSG, on the “Current RSV Status” page and under the “Resource Group” menu.
- Periodic Availability Reports. The availability of all active registered OSG resources and the services running on each of those resources is calculated using the results received for critical metrics. Once a day, these availability numbers are published online and via email as explained here (More information: Outline of reports, Installation guide for GOC staff).
- RSV-SAM Transport. The WLCG RSV-SAM Transport infrastructure pushes out RSV results, for resources that are flagged to be part of the WLCG Interoperability agreement, from the GOC collector to WLCG's Service Availability Monitoring (SAM) system. More information on viewing these results is available here.
- MyOSG and OIM Links. RSV picks up resource information, WLCG interoperability information, etc., from a MyOSG resource group summary listing, which is in turn based on the OSG Information Management (OIM) topology system (requires registration). Resource maintenance windows scheduled in OIM are forwarded to WLCG SAM, if applicable.
Before starting the installation process, consider the following points (consulting the Reference section below as needed):
- User IDs: If they do not exist already, the installation will create the Linux user IDs
- Service certificate: The RSV service requires a service certificate (/etc/grid-security/rsv/rsvcert.pem) and a matching key
- Network ports: To view results, port 80 must accept incoming requests; outbound connectivity to tested services must work, too
- Host choice: Install RSV on your site CE unless you have specific reasons (e.g., performance) for installing on a separate host
As with all OSG software installations, there are some one-time (per host) steps to prepare in advance:
An installation of RSV at a site consists of the RSV client software, the Apache web server, parts of HTCondor (for its cron-like scheduling capabilities), and various other small tools. To simplify installation, OSG provides a convenience RPM that installs all required software with a single command.
- Consider updating your local cache of Yum repository data and your existing RPM packages:
[root@rsv ~]$ yum clean all --enablerepo='*'
[root@rsv ~]$ yum update
Note that the update command will update all packages on your system.
- If you have installed HTCondor already but not by RPM, install a special empty RPM to satisfy RSV's package dependency on HTCondor:
[root@rsv ~]$ yum install empty-condor --enablerepo=osg-empty
- Install RSV and related software:
[root@rsv ~]$ yum install rsv
(Optional) Special one-time clean-up instructions for RSV perfSONAR 1.1.2 or later
If you run RSV perfSONAR and have upgraded from version 1.1.1 or earlier to version 1.1.2 or later, there is a clean-up step you should take to fix an unnecessary symlink. This optional, one-time procedure is recommended if it applies to your installation.
- Check to see if you need to perform this step:
[root@rsv ~]$ ls -l /usr/share/rsv/www
If www is a symlink to /var/www/html/rsv, continue with the procedure; if not, you are done.
- Stop (only) the RSV service using the instructions below
- Verify that RSV is not running:
[root@rsv ~]$ condor_cron_q
0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended
- Remove the symbolic link:
[root@rsv ~]$ unlink /usr/share/rsv/www
- Move the formerly linked directory into place:
[root@rsv ~]$ mv -f /var/www/html/rsv /usr/share/rsv/www
- Make sure that all RSV files are owned by RSV:
[root@rsv ~]$ chown -R rsv:rsv /usr/share/rsv/www
- Restart the RSV service using the instructions below
This procedure can be done before or after upgrading the RSV perfSONAR package to version 1.1.2 or later.
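The two checks in the procedure above (whether www is a symlink to the old location, and whether the HTCondor-Cron queue is empty) can be scripted. The following is a minimal sketch, not something shipped with RSV: the function names needs_www_cleanup and cron_queue_is_empty are hypothetical, and the queue check assumes the summary line has the "0 jobs; ..." form shown above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of RSV).

# Succeeds only when $1 is a symbolic link whose stored target is exactly $2.
needs_www_cleanup() {
    link_path="$1"      # e.g. /usr/share/rsv/www
    expected="$2"       # e.g. /var/www/html/rsv
    # -L is true only for symbolic links; a real directory fails here.
    [ -L "$link_path" ] || return 1
    # readlink prints the link target as stored in the link.
    [ "$(readlink "$link_path")" = "$expected" ]
}

# Reads condor_cron_q output on stdin and succeeds when the summary line
# reports exactly zero jobs (so "10 jobs;" does not match).
cron_queue_is_empty() {
    grep -Eq '(^|[^0-9])0 jobs;'
}

# Example use with the paths from the procedure:
# if needs_www_cleanup /usr/share/rsv/www /var/www/html/rsv; then
#     echo "clean-up needed"
# fi
# condor_cron_q | cron_queue_is_empty && echo "RSV is not running"
```

The link check compares the target string literally, so a link stored with a trailing slash or as a relative path would need to be normalized first.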
After installation, there are some one-time configuration steps to tell RSV how to operate at your site.
Edit /etc/osg/config.d/30-rsv.ini and follow the instructions in the file. There are detailed comments for each setting. In the simplest case, to monitor only your CE, set the htcondor_ce_hosts variable (or gram_ce_hosts for a GRAM CE) to the fully qualified hostname of your CE. For a sample rsv.ini file, see the complete installation output below.
If you have installed HTCondor already but not by RPM, specify the location of the HTCondor installation in the condor_location setting of 30-rsv.ini. If an HTCondor RPM is installed, you do not need to set condor_location.
Complete the configuration using the osg-configure tool:
[root@rsv ~]$ osg-configure -v
[root@rsv ~]$ osg-configure -c
The osg-configure tool produces a lot of output; see below for an example.
The following configuration steps are optional and will likely not be required for setting up a small or typical site. If you do not need any of the following special configurations, skip to the section on using RSV.
Generally speaking, read the ConfigureRsv page for more advanced configuration options, or see below for specific advanced configuration scenarios.
Configuring RSV to run probes using a remote server
RSV monitors systems by running probes, which can run on the RSV host itself (the default case), via a separate batch system like HTCondor, or via a remote batch system using a Globus gatekeeper and its job manager. With either of the last two options, the probe jobs can be counted and reported to, for example, Gratia.
When running probes on remote systems, remember to:
- Add the RSV user rsv on all the systems where the probes may run, and
- Map the RSV service certificate to the user you intend to use for RSV. This should be a local user used exclusively for RSV and not belonging to an institutional VO, so that RSV probes are not accounted as regular VO jobs in Gratia. This can be done in GUMS or using a grid-mapfile-local (if you use a grid-mapfile). MapServiceCertToRsvUser explains how to configure GUMS or the grid-mapfile. Also see the CE installation document for more information.
Configuring the RSV web server to use HTTPS instead of HTTP
If you would like your local RSV web server to use HTTPS instead of the default HTTP (for compatibility or security reasons), complete the steps below. This procedure assumes that you already have an HTTP service certificate (or a copy of the host certificate) in
. If not, omit the
modifications below, and your web server will start with its own, self-signed certificate.
- Install the Apache SSL module:
[root@rsv ~]$ yum install mod_ssl
- Make an alternate set of HTTP service certificate files:
[root@rsv ~]$ cp -p /etc/grid-security/http/httpcert.pem /etc/grid-security/http/httpcert2.pem
[root@rsv ~]$ cp -p /etc/grid-security/http/httpkey.pem /etc/grid-security/http/httpkey2.pem
[root@rsv ~]$ chown apache:apache /etc/grid-security/http/http*2.pem
- Back up existing Apache configuration files:
[root@rsv ~]$ cp -p /etc/httpd/conf/httpd.conf /etc/httpd/conf/httpd.conf.orig
[root@rsv ~]$ cp -p /etc/httpd/conf.d/ssl.conf /etc/httpd/conf.d/ssl.conf.orig
- Change the default port for HTTP connections to 8000 by editing
- Set up HTTPS access by editing
After these changes, when you start the Apache service, it will listen on ports
(for HTTP) and
(for HTTPS), rather than the default port
(for HTTP only).
- If you make the changes above, you must restart the Apache server after each CA certificate update to pick up the changes.
Managing RSV and associated services
In addition to the RSV service itself, there are a number of supporting services in your installation. The specific services are:
|| Service name
| Fetch CRL
|| On EL 6:
On EL 5:
| See CA documentation for more info
Start the services in the order listed and stop them in reverse order. As a reminder, here are common service commands (all run as root):
| To … | Run the command … |
| Start a service | service SERVICE-NAME start |
| Stop a service | service SERVICE-NAME stop |
| Enable a service to start during boot | chkconfig SERVICE-NAME on |
| Disable a service from starting during boot | chkconfig SERVICE-NAME off |
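The rule of starting services in the listed order and stopping them in reverse can be sketched as a pair of shell functions. This is an illustration only: the function names are hypothetical, the service names are passed in as arguments (the placeholders in the usage note are not real service names), and the functions print the commands rather than run them so the order can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helpers (not part of RSV). Each prints the commands it would
# run, one per line, instead of executing them.

# Print start commands for the services in the order given.
print_start_commands() {
    for svc in "$@"; do
        echo "service $svc start"
    done
}

# Print stop commands for the same services in reverse order.
# Assumes service names contain no whitespace.
print_stop_commands() {
    reversed=""
    for svc in "$@"; do
        reversed="$svc $reversed"   # prepend, so the last service comes first
    done
    for svc in $reversed; do
        echo "service $svc stop"
    done
}
```

For example, print_stop_commands SERVICE-A SERVICE-B SERVICE-C prints the stop command for SERVICE-C first and SERVICE-A last; piping the output to sh as root would then execute the commands in that order.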
Running RSV manually
Normally, the HTCondor-Cron scheduler runs RSV periodically. However, you can run RSV probes manually at any time:
[root@rsv ~]$ rsv-control --run --all-enabled
If successful, results will be available from your local RSV web server (e.g., http://localhost/rsv) and, if enabled (which is the default), on MyOSG.
You can also run the metrics individually or pass special parameters as explained in the rsv-control document.
You can find more information on troubleshooting RSV in the rsv-control documentation and in TroubleshootRSV.
Important file locations
Logs and configuration:
| File Description
| Initial configuration
| Read by
| RSV configuration
| Generally files in this directory should not be edited directly. Use
| Metric configuration
| To change arguments and environment
To find the metrics and the other files in RSV, you can also use the rpm commands:
rpm -ql rsv-metrics
rpm -ql rsv
Getting more information from rsv-control
The first step to getting more information is to run rsv-control with more verbosity, using the -v flag. This flag can be used with any of rsv-control's abilities (run, enable, list, etc.). The verbosity levels are:
- 0 = print nothing
- 1 = print warnings and errors along with usual output of command being run (1 is the default level)
- 2 = adds informational messages
- 3 = full debugging output
For example, here is the output when running a metric with -v 2:
[root@fermicloud016 condor]# rsv-control -r org.osg.general.osg-version -v 2 -u osg-edu.cs.wisc.edu
INFO: Reading configuration file /etc/rsv/rsv.conf
INFO: Reading configuration file /etc/rsv/consumers.conf
INFO: Validating configuration:
INFO: Validating user:
INFO: Invoked as root. Switching to 'rsv' user (uid: 100 - gid: 102)
INFO: Registered consumers: html-consumer, gratia-consumer
INFO: Loading config file '/etc/rsv/meta/metrics/org.osg.general.osg-version.meta'
INFO: Loading config file '/etc/rsv/metrics/org.osg.general.osg-version.conf'
INFO: Optional config file '/etc/rsv/metrics/osg-edu.cs.wisc.edu/org.osg.general.osg-version.conf' does not exist
INFO: Checking proxy:
INFO: Using service certificate proxy
INFO: Running command with timeout (1200 seconds):
/usr/bin/openssl x509 -in /tmp/rsvproxy -noout -enddate -checkend 21600
INFO: Exit code of job: 0
INFO: Service certificate valid for at least 6 hours.
INFO: Pinging host osg-edu.cs.wisc.edu:
INFO: Running command with timeout (1200 seconds):
/bin/ping -W 3 -c 1 osg-edu.cs.wisc.edu
INFO: Exit code of job: 0
INFO: Ping successful
Running metric org.osg.general.osg-version:
INFO: Executing job remotely using Condor-G
INFO: Setting up job environment:
INFO: No environment setup declared
INFO: Condor-G working directory: /var/tmp/rsv/condor_g-JiQthF
INFO: Forming arguments:
INFO: Arguments: ''
INFO: List of files to transfer: /usr/libexec/rsv/probes/RSVMetric.pm
INFO: Condor submission: Submitting job(s).
1 job(s) submitted to cluster 2.
INFO: Trimming data to 10000 bytes because details-data-trim-length is set
INFO: Creating record for html-consumer consumer at '/var/spool/rsv/html-consumer/org.osg.general.osg-version.7rgLfn'
INFO: Creating record for gratia-consumer consumer at '/var/spool/rsv/gratia-consumer/org.osg.general.osg-version.-qelnL'
timestamp: 2012-01-25 16:12:40 CST
detailsData: OSG 1.2.26
Using the RSV verify tool
The --verify flag runs some basic checks of your RSV installation:
$> rsv-control --verify
Testing if Condor-Cron is running...
Testing if metrics are running...
OK (98 running metrics)
Testing if consumers are running...
OK (1 running consumers)
Checking which consumers are configured...
The following consumers are enabled: html-consumer
WARNING: The gratia-consumer is not enabled. This indicates that your
resource is not reporting to OSG.
This tool is still under development and performs only basic checks, but it is a good first step when debugging issues.
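When the verify output is long, it can help to filter it for warnings. The sketch below is not part of RSV; the function name verify_has_no_warnings is hypothetical, and it assumes warnings begin with "WARNING:" at the start of a line, as in the sample output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of RSV). Reads `rsv-control --verify` output
# on stdin, prints every line that starts with "WARNING:", and fails
# (returns 1) if at least one such line was found.
verify_has_no_warnings() {
    # grep prints the matching lines and exits 0 only when it found a match.
    if grep '^WARNING:'; then
        return 1
    fi
    return 0
}

# Example use:
# rsv-control --verify | verify_has_no_warnings || echo "review the warnings above"
```

Only the first line of a multi-line warning is printed, so the full output is still worth reading when the check fails.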
To get assistance, please use this page
RSV has a tool to collect information useful for troubleshooting. For most problems the RSV support team will ask you to generate a profile tarball to share information (including log and configuration files).
You can save some time by doing so with your original request:
$> rsv-control --profile
Running the rsv-profiler...
Making tarball (rsv-profiler.tar.gz)
Here are some other RSV documents that might be helpful:
The RSV installation will create two users unless they are already created. The users are created when the
packages are installed.
| Runs the RSV tests; the RSV certificate (below) will need to be owned by this user
| Runs the Condor Cron processes to schedule the running of the tests
Note that if you pre-create the RSV user, it should have a working shell. That is, it shouldn't have a default shell of
If you manage your
file with configuration management software such as Puppet, CFEngine or 411, make sure the UID and GID in
matches the UID and GID of the
user and group in
Ensure an RSV service certificate is installed in
and the certificate files are owned by the
user. Adjust the permissions if necessary: the certificate must be readable by all, and the key must be readable only by its owner.
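The permission requirement can be checked with a short script. This is a sketch under stated assumptions, not an RSV tool: the function name check_cert_permissions is hypothetical, the certificate and key paths are supplied by the caller, and it relies on the GNU coreutils form of stat (stat -c '%a'), which is what Enterprise Linux hosts provide. It does not check file ownership.

```shell
#!/bin/sh
# Hypothetical helper (not part of RSV).
# Usage: check_cert_permissions CERT_FILE KEY_FILE
# Returns 0 when the permissions look right, 1 when they do not,
# and 2 when a file cannot be inspected.
check_cert_permissions() {
    cert="$1"
    key="$2"
    # %a prints the permission bits in octal, e.g. 644 or 600.
    cert_mode=$(stat -c '%a' "$cert") || return 2
    key_mode=$(stat -c '%a' "$key") || return 2

    # Certificate: owner, group, and other must all have the read bit (4).
    case "$cert_mode" in
        [4567][4567][4567]) ;;
        *) echo "certificate $cert is not readable by all (mode $cert_mode)"
           return 1 ;;
    esac

    # Key: readable by the owner, with no permissions for group or other.
    case "$key_mode" in
        [4567]00) ;;
        *) echo "key $key must be readable by its owner only (mode $key_mode)"
           return 1 ;;
    esac
    return 0
}
```

A certificate with mode 644 and a key with mode 600 or 400 pass the check; a key with mode 640, for example, is reported.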
You may need another certificate owned by
if you'd like an authenticated web server; see Configuring the RSV web server to use HTTPS instead of HTTP
to request a service certificate.
For more details on overall firewall configuration, please see our Firewall documentation.
| Service Name
|| Port Number
|| RSV runs an HTTP server (Apache) that publishes a page with the RSV testing results
|| RSV pushes testing results to the OSG Gratia Collectors at opensciencegrid.org
|| Allow outbound network connection to all services that you want to test
Or, if you'd rather have your RSV web page appear as
like it used to in OSG 1.2, the first column above would be HTTPS
. See above
for how to configure this.