HTCondor/GlideinWMS Probe Configuration

Overview

This probe reports usage from an HTCondor system.

This probe reads the GLIDEIN_ResourceType startd attribute to identify the Resource on which the job ran. This attribute needs to be set by the GlideinWMS Factory (the OSG factories already set it). If this attribute does not exist, the probe falls back to GLIDEIN_Site, and finally to the FileSystemDomain of the host.
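The fallback order described above can be sketched as a small shell function. This is only an illustration of the documented order, not the probe's actual code; the function name resolve_resource_name and the convention that an empty string means "attribute not set" are assumptions made for the example.

```shell
# Sketch only: mirrors the documented fallback order, not the probe's own code.
# Hypothetical inputs: $1 = GLIDEIN_ResourceType, $2 = GLIDEIN_Site,
# $3 = FileSystemDomain. An empty value is treated as "attribute not set".
resolve_resource_name() {
    for candidate in "$1" "$2" "$3"; do
        if [ -n "$candidate" ]; then
            # First non-empty value wins.
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    # None of the three attributes was available.
    return 1
}
```

For example, calling resolve_resource_name "" "SiteB" "example.org" prints SiteB, because GLIDEIN_ResourceType is missing and GLIDEIN_Site is the next candidate.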

The usage will be forwarded to the production Gratia collector (gratia-osg-prod.opensciencegrid.org) by default.

This document covers a probe installed on a submit host that submits directly or indirectly (by flocking) to a GWMS system (GWMS Frontend, submit host, ...). If you are installing or configuring a Compute Element, check the CE install guide instead. The underlying probe is the same, but the configuration differs.

Requirements and Preparation

The probe assumes that the Condor binaries (condor_history, for example) are in the default path (/usr/bin). If they are not, see the section Non-Standard Condor Install.

Host requirements (bare-bone installation):

  • Probe must be installed on the VO's schedd.
  • Currently most of our testing has been done on Scientific Linux 5.
  • Root access
  • Outbound network connections on port 80 to the Gratia collector must be allowed (the default collector is http://gratia-osg-prod.opensciencegrid.org). Gratia does not support HTTP proxies.

To be part of OSG, your submit host must be registered in OIM (e.g. as a Submit Node).

Installation

  1. Install the GlideinWMS Gratia Probe:
    [root@client ~]$ yum install gratia-probe-glideinwms 
  2. Edit the probe configuration file, /etc/gratia/condor/ProbeConfig. First, set SiteName and ProbeName to a unique identifier for your GlideinWMS submit host. There can be multiple probes (with different names) per site. If you haven't already, register your GlideinWMS submit host in OIM; you can then use the name under which you registered the resource.
    ProbeName="condor:<hostname>"
    SiteName="HCC-GlideinWMS-Frontend"
    Next, turn the probe on by setting EnableProbe:
    EnableProbe="1"
  3. Reconfigure HTCondor:
    [root@client ~]$ condor_reconfig
  4. Start the services, and configure them to start automatically when the system reboots:
    [root@client ~]$ service gratia-probes-cron start
    [root@client ~]$ chkconfig --level 345 gratia-probes-cron on
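After editing the ProbeConfig it can help to confirm that the settings from step 2 took effect. The following is a minimal sketch, assuming each attribute sits on its own line in the form Name="value" as in the excerpts above. The helper names probe_attr and probe_is_enabled are hypothetical and are not part of Gratia.

```shell
# Print the value of attribute $2 from the ProbeConfig-style file $1.
# Assumes one attribute per line, written as Name="value".
probe_attr() {
    sed -n "s/^[[:space:]]*$2=\"\([^\"]*\)\".*/\1/p" "$1" | head -n 1
}

# Succeed only if the file $1 contains EnableProbe="1".
probe_is_enabled() {
    [ "$(probe_attr "$1" EnableProbe)" = "1" ]
}

# Possible usage on the submit host:
#   probe_attr /etc/gratia/condor/ProbeConfig ProbeName
#   probe_is_enabled /etc/gratia/condor/ProbeConfig || echo "probe is still disabled"
```

The functions only read the file, so they are safe to run at any point during the installation.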

Usage Graphs

Usage graphs can be found under the heading Glidein Bar Graphs at http://gratiaweb.grid.iu.edu/gratia/xml

Specifically, this page may be useful. Replace the variable probe with the name of your probe that you configured above.

NOTE: Usage may take up to a few hours to appear after a job has completed.

Unusual Use Cases

Users without Certificates

If you have users who submit jobs without a certificate explicitly declared in the submit file, you will need to add MapUnknownToGroup to the ProbeConfig. In the file /etc/gratia/condor/ProbeConfig, add the attribute after EnableProbe:

    ...
    SuppressGridLocalRecords="0"
    EnableProbe="1"
    MapUnknownToGroup="1"

    Title3="Tuning parameter"
    ...

Further, if you want to record all usage as coming from a single VO, you can configure the probe to override the 'guessed' VO. In the example below, replace Engage with the registered VO that you would like to report as. If you are not affiliated with a VO, you may use Engage.

    ...
    MapUnknownToGroup="1"
    MapGroupToRole="1"
    VOOverride="Engage"
    ...

Non-Standard Condor Install

If Condor is installed in a non-standard location (i.e. not from RPMs, or from an RPM relocated outside /usr/bin), then you need to tell the probe where to find the Condor binaries. This is done with a special attribute in /etc/gratia/condor/ProbeConfig, CondorLocation. Point it to the location of the Condor install, such that CondorLocation/bin/condor_version exists.
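A candidate value for CondorLocation can be checked against the condition stated above before it is written into the ProbeConfig. This is a minimal sketch; the function name valid_condor_location and the example path are hypothetical.

```shell
# Succeed if $1 looks like a usable CondorLocation, i.e.
# $1/bin/condor_version exists as a regular, executable file.
valid_condor_location() {
    [ -n "$1" ] && [ -f "$1/bin/condor_version" ] && [ -x "$1/bin/condor_version" ]
}

# Possible usage (the path is only an example):
#   valid_condor_location /opt/condor && echo "OK to use as CondorLocation"
```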

No RPM install of Condor

By default, installing the Gratia probe also installs Condor if it has not already been installed by RPM (or yum). To prevent this, install empty-condor so that yum believes Condor is already present:
    [root@client ~]$ yum install empty-condor gratia-probe-condor

New Data Directory

In /etc/gratia/condor/ProbeConfig, the value of DataFolder (near the bottom) needs to be the same as the Condor configuration variable PER_JOB_HISTORY_DIR. You can get the value of PER_JOB_HISTORY_DIR with the command:
    condor_config_val PER_JOB_HISTORY_DIR
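The comparison between DataFolder and PER_JOB_HISTORY_DIR can be automated. The sketch below assumes DataFolder sits on its own line as DataFolder="..." and ignores a single trailing slash on either side, which is a choice made for this example. The function name data_folder_matches is hypothetical.

```shell
# Succeed if DataFolder in the ProbeConfig-style file $1 equals directory $2.
# A single trailing slash on either value is ignored.
data_folder_matches() {
    configured=$(sed -n 's/^[[:space:]]*DataFolder="\([^"]*\)".*/\1/p' "$1" | head -n 1)
    [ -n "$configured" ] && [ "${configured%/}" = "${2%/}" ]
}

# Possible usage on the submit host:
#   data_folder_matches /etc/gratia/condor/ProbeConfig \
#       "$(condor_config_val PER_JOB_HISTORY_DIR)" || echo "DataFolder mismatch"
```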

Different collector and other customizations

By default the probe reports to the OSG Gratia service (Collector). To change that, edit the configuration file, /etc/gratia/condor/ProbeConfig, and replace the OSG production host with your desired one:
...
    CollectorHost="gratia-osg-prod.opensciencegrid.org:80"
    SSLHost="gratia-osg-prod.opensciencegrid.org:443"
    SSLRegistrationHost="gratia-osg-prod.opensciencegrid.org:80"
...
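Replacing the host in all three attributes can be done in one pass. The sketch below writes the rewritten configuration to standard output and leaves the input file untouched, so the result can be reviewed before it replaces the real ProbeConfig. It assumes each attribute sits on its own line and keeps the existing port numbers. The function name retarget_collector and the host gratia.example.org are hypothetical.

```shell
# Print the ProbeConfig-style file $1 with the host part of CollectorHost,
# SSLHost and SSLRegistrationHost replaced by $2. Ports are kept as they are.
retarget_collector() {
    sed -e "s/^\([[:space:]]*CollectorHost=\"\)[^:\"]*/\1$2/" \
        -e "s/^\([[:space:]]*SSLHost=\"\)[^:\"]*/\1$2/" \
        -e "s/^\([[:space:]]*SSLRegistrationHost=\"\)[^:\"]*/\1$2/" "$1"
}

# Possible usage (gratia.example.org is a placeholder):
#   retarget_collector /etc/gratia/condor/ProbeConfig gratia.example.org > /tmp/ProbeConfig.new
```

Writing to a separate file first makes it easy to diff the result against the original before copying it into place.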

You can find more information about the content of the configuration file and how to change it in ProbeConfig and ProbeConfigCondor.

References

Some links about OSG Accounting with Gratia: The GlideinWMS

Topic revision: r29 - 06 Dec 2016 - 18:12:35 - KyleGross