RSV Reporting Overview
This page is an overview of the RSV reporting effort. This effort is a combination of work from the WLCG, OSG GOC, and the OSG Metrics and Measurements group.
RSV reporting is an effort to meet the following goals:
- Help site administrators understand the test results from their site.
- Provide the OSG executive team with information about the OSG facility's status.
- Compare and align measurements made by the OSG with those made by the WLCG.
We implement these goals through a set of daily reports published via email (text-only) and archived online (HTML). These email reports combine information from WLCG, the RSV database, and configuration files. The process for installing the reports is documented here. The reports are run automatically by the GOC.
One of the key algorithms we use to evaluate sites is the WLCG availability algorithm. Based on the RSV data, a list of availability metrics for each service, and downtime information from OIM, the WLCG availability algorithm produces two numbers for a given time interval: the availability and the reliability.
The availability is designed to measure the percentage of time that the entity was functioning for users; the reliability is designed to measure the percentage of the scheduled availability (i.e., the times when the entity wasn't under maintenance) during which the entity was functioning. Here, an "entity" can be one of three things:
- Service: A set of functionalities provided to an end-user by a resource.
- Site: A logical set of resources providing various services (most often, a combination of an SE and a CE).
- Federation: A single site or group of sites which have entered into the WLCG MoU as a single entity.
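
The difference between availability and reliability can be illustrated with a minimal Python sketch. This is not the WLCG implementation; it only encodes the two definitions given above, and it assumes the RSV results and OIM downtimes have already been reduced to hour totals for one entity and one time interval. The names `Interval`, `availability`, and `reliability` are hypothetical, and the handling of periods with unknown status is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interval:
    """Hour totals for one entity over one reporting interval (hypothetical layout)."""
    hours_total: float           # length of the reporting interval
    hours_up: float              # time the entity was functioning for users
    hours_scheduled_down: float  # scheduled maintenance time (e.g. from OIM)


def availability(iv: Interval) -> float:
    """Fraction of the whole interval during which the entity was functioning."""
    if iv.hours_total <= 0:
        raise ValueError("interval must have a positive length")
    return iv.hours_up / iv.hours_total


def reliability(iv: Interval) -> Optional[float]:
    """Fraction of the scheduled (non-maintenance) time during which the entity was functioning."""
    hours_scheduled_up = iv.hours_total - iv.hours_scheduled_down
    if hours_scheduled_up <= 0:
        # The whole interval was scheduled maintenance, so reliability is undefined here.
        return None
    return iv.hours_up / hours_scheduled_up


# One day with 4 hours of scheduled maintenance and 2 hours of unscheduled failure.
day = Interval(hours_total=24.0, hours_up=18.0, hours_scheduled_down=4.0)
print(f"availability = {availability(day):.2f}")  # 18 / 24
print(f"reliability  = {reliability(day):.2f}")   # 18 / (24 - 4)
```

In this example the scheduled maintenance lowers the availability but not the reliability, which is why the two numbers are reported separately.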