RSV Troubleshooting Guide

About this Document

This document describes how to troubleshoot RSV v3 (part of OSG 3.0).

How to get Help?

To get assistance, please use the Help Procedure. The developers of a specific metric or probe may provide an additional support channel; if your metric is listed in the table below, check its troubleshooting document.

Debugging errors from specific metrics

The RSV framework runs probes: programs that perform tests and measure one or more metrics. Different probes may be written by different developers, and each metric can fail for its own reasons, so there may be specific suggestions for fixing a given problem. The table below points to the right troubleshooting page for the metric you want to check.

Metrics                                             Debugging help
org.osg.certificates.cacert-expiry                  cacert-verify-probe
org.osg.certificates.crl-expiry                     crl-freshness-probe
org.osg.gratia.condor                               gratia-config-probe
org.osg.gratia.gridftp-transfer                     gratia-config-probe
org.osg.gratia.hadoop-transfer                      gratia-config-probe
org.osg.gratia.lsf                                  gratia-config-probe
org.osg.gratia.metric                               gratia-config-probe
org.osg.gratia.pbs                                  gratia-config-probe
org.osg.gratia.sge                                  gratia-config-probe
org.osg.gip.consistency                             info-services-probe
org.osg.gip.lastrun                                 info-services-probe
org.osg.gip.freshness                               info-services-probe
org.osg.srm.srmcp-readwrite                         SRM (srmcp) read-write probe
org.osg.ress.ress-classad-exists                    ress-classad-exists-probe
org.osg.ress.cemon-containerkeyfile-CE-permissions  cemon-containerkeyfile-ce-permissions-probe
org.osg.xroot.ping                                  xroot-multi-probe
org.osg.xroot.grid-xrdcp-direct                     xroot-multi-probe
org.osg.xroot.grid-xrdcp-fax                        xroot-multi-probe
org.osg.xroot.grid-xrdcp-compare                    xroot-multi-probe

Resending failed Gratia records

If RSV fails to send Gratia records, it will save a copy of the output into /var/spool/rsv/failed-gratia-scripts. Your HTML status page will notify you when files are present in this directory.

If files appear there, you can look for the cause in this log file: /var/log/rsv/consumers/gratia-consumer.output. The file is rotated, so the original error message may no longer be present.
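The log check described above could be scripted roughly as follows. This is only a sketch: search_consumer_log is a hypothetical helper name, the 'error|fail' search pattern is a guess at what a failure message contains, and the assumption that rotated copies keep the log's name as a prefix may not match your logrotate setup.

```shell
# Sketch: list lines that mention an error in the consumer log and in any
# rotated copies. search_consumer_log is a hypothetical helper; the search
# pattern and the rotated-file naming are assumptions, not RSV behaviour.
search_consumer_log() {
    log=$1
    # The trailing * also matches rotated copies that share the log's name.
    # /dev/null as an extra operand makes grep print the file name on every
    # match, even when only one log file exists.
    grep -i -n -E 'error|fail' "$log"* /dev/null
}

# Example call, using the log path given above:
# search_consumer_log /var/log/rsv/consumers/gratia-consumer.output
```

The function returns a non-zero status when nothing matches, which may simply mean the relevant messages have already been rotated away.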

Usually this error is spurious: the central collector may have been unavailable, or there may have been a network problem. The first step is to try to resend these files. To do so, move them back into the gratia-consumer directory, and they will be resent the next time the gratia-script-consumer runs (about every 5 minutes):

mv /var/spool/rsv/failed-gratia-records/* /var/spool/rsv/gratia-consumer/
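The plain mv above fails with an error when the failed directory is empty, because the glob matches nothing. A slightly more defensive variant is sketched below. resend_failed_records is a hypothetical helper name, not part of RSV, and both directories are passed as arguments rather than hard-coded, so check the actual paths on your installation before running it.

```shell
# Sketch: move failed Gratia records back into the consumer spool directory.
# resend_failed_records is a hypothetical helper, not an RSV command. Both
# directories are arguments; take the real paths from your own installation.
resend_failed_records() {
    failed_dir=$1
    spool_dir=$2
    moved=0

    if [ ! -d "$failed_dir" ] || [ ! -d "$spool_dir" ]; then
        echo "missing directory: $failed_dir or $spool_dir" >&2
        return 1
    fi

    for record in "$failed_dir"/*; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$record" ] || continue
        mv "$record" "$spool_dir"/ && moved=$((moved + 1))
    done

    echo "moved $moved file(s)"
}

# Example call, reusing the paths from the mv command above:
# resend_failed_records /var/spool/rsv/failed-gratia-records /var/spool/rsv/gratia-consumer
```

Run it as a user with write access to both directories. A reported count of zero means there was nothing to resend.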

If you continue to have problems, please contact the GOC.

References

RSV documents

Topic revision: r15 - 28 Sep 2017 - 18:27:12 - MatyasSelmeci