HTCondor/GlideinWMS Probe Configuration

Overview

This probe reports usage from an HTCondor system.

This probe will read the GLIDEIN_ResourceType startd attribute to identify on which Resource the job ran. This attribute needs to be set by the GlideinWMS Factory (The OSG factories already have this set). If this attribute does not exist, the probe will fall back to GLIDEIN_Site, and finally the FileSystemDomain of the host.
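
The lookup order described above can be illustrated with a small shell sketch. This is not the probe's own code: the function name resolve_resource_name is hypothetical, the three values are passed in by hand, and an empty string stands in for an attribute that does not exist.

```shell
# Illustrative only: mirrors the documented lookup order, not the probe's source.
# Arguments: $1 = GLIDEIN_ResourceType, $2 = GLIDEIN_Site, $3 = FileSystemDomain
resolve_resource_name() {
    for candidate in "$1" "$2" "$3"; do
        # The first non-empty value wins
        if [ -n "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    # None of the three attributes was set
    return 1
}
```

For example, resolve_resource_name "" "SiteA" "example.edu" prints SiteA, because the first attribute is missing and the second one is set.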

The usage will be forwarded to the production Gratia collector (gratia-osg-prod.opensciencegrid.org) by default.

This document covers a probe installed on a Submit host that submits directly or indirectly (via flocking) to a GWMS System (GWMS Frontend, Submit host, ...). If you are installing or configuring a Compute Element, check the CE install guide instead. The underlying probe is the same, but the configuration differs.

Requirements and Preparation

The probe assumes that the Condor binaries (condor_history, for example) are in the default path (/usr/bin). If they are not, see the section Non-Standard Condor Install.

Host requirements (bare-bone installation):

  • Probe must be installed on the VO's schedd.
  • Currently most of our testing has been done on Scientific Linux 5.
  • Root access
  • Allow outbound network connection, on port 80, to the Gratia collector (default is http://gratia-osg-prod.opensciencegrid.org). Gratia does not support HTTP proxies.

To be part of OSG, your Submit host must be registered in OIM (e.g. as a Submit Node).

Install the Yum Repositories required by OSG

The OSG RPMs currently support Red Hat Enterprise Linux 5 (32-bit and 64-bit) and its variants (Scientific Linux 5 and CentOS).

OSG RPMs are distributed via the OSG yum repositories. Some packages depend on packages distributed via the EPEL repositories, so both repositories must be enabled.

Install EPEL

  • Install the EPEL repository, if not already present. Note: This enables EPEL by default. Choose the right version to match your OS version.
    # EPEL 5 (For RHEL 5, CentOS 5, and SL 5)
    [root@client ~]$ curl -O https://dl.fedoraproject.org/pub/epel/epel-release-latest-5.noarch.rpm
    [root@client ~]$ rpm -Uvh epel-release-latest-5.noarch.rpm
    # EPEL 6 (For RHEL 6, CentOS 6, and SL 6)
    [root@client ~]$ rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm
    # EPEL 7 (For RHEL 7, CentOS 7, and SL 7)
    [root@client ~]$ rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
    WARNING: If you have your own mirror or configuration of the EPEL repository, you MUST verify that the OSG repository has a better (numerically lower) yum priority than EPEL. Otherwise, you will encounter strange dependency resolution (depsolving) issues.

Install the Yum priorities package

For packages that exist in both OSG and EPEL repositories, it is important to prefer the OSG ones or else OSG software installs may fail. Installing the Yum priorities package enables the repository priority system to work.

  1. Choose the correct package name based on your operating system's major version:

    • For EL 5 systems, use yum-priorities
    • For EL 6 and EL 7 systems, use yum-plugin-priorities
  2. Install the Yum priorities package:

    [root@client ~]$ yum install PACKAGE

    Replace PACKAGE with the package name from the previous step.

  3. Ensure that /etc/yum.conf has the following line in the [main] section (particularly when using ROCKS), thereby enabling Yum plugins, including the priorities one:

    plugins=1
    NOTE: If you do not have a required key you can force the installation using --nogpgcheck; e.g., yum install --nogpgcheck yum-priorities.
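
Step 3 can be checked from the command line. The sketch below is one possible way to do it, assuming a yum.conf in the usual INI layout; the function name yum_plugins_enabled is hypothetical and is not part of yum or of the OSG packages.

```shell
# Succeeds (exit status 0) only if "plugins=1" appears inside the [main] section.
# $1: path to a yum.conf-style file
yum_plugins_enabled() {
    awk '
        # Track whether the current section header is [main]
        /^[ \t]*\[/ { in_main = ($0 ~ /^[ \t]*\[main\]/); next }
        # Accept optional whitespace around the equals sign
        in_main && /^[ \t]*plugins[ \t]*=[ \t]*1[ \t]*$/ { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

A possible invocation:

    [root@client ~]$ yum_plugins_enabled /etc/yum.conf && echo "plugins enabled"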

Install OSG Repositories

  1. If you are upgrading from OSG 3.1 (or 3.2) to OSG 3.2 (or 3.3), remove the old OSG repository definition files and clean the Yum cache:

    [root@client ~]$ yum clean all
    [root@client ~]$ rpm -e osg-release

    This step ensures that local changes to *.repo files will not block the installation of the new OSG repositories. After this step, *.repo files that have been changed will exist in /etc/yum.repos.d/ with the *.rpmsave extension. After installing the new OSG repositories (the next step) you may want to apply any changes made in the *.rpmsave files to the new *.repo files.

  2. Install the OSG repositories using one of the following methods depending on your EL version:

    1. For EL versions greater than EL5, install the files directly from repo.grid.iu.edu:

      [root@client ~]$ rpm -Uvh URL

      Where URL is one of the following:

      • OSG 3.2
        EL6 URL (for RHEL 6, CentOS 6, or SL 6): https://repo.grid.iu.edu/osg/3.2/osg-3.2-el6-release-latest.rpm
        EL7 URL (for RHEL 7, CentOS 7, or SL 7): N/A
      • OSG 3.3
        EL6 URL (for RHEL 6, CentOS 6, or SL 6): https://repo.grid.iu.edu/osg/3.3/osg-3.3-el6-release-latest.rpm
        EL7 URL (for RHEL 7, CentOS 7, or SL 7): https://repo.grid.iu.edu/osg/3.3/osg-3.3-el7-release-latest.rpm
    2. For EL5, download the repo file and install it using the following:

      [root@client ~]$ curl -O https://repo.grid.iu.edu/osg/3.2/osg-3.2-el5-release-latest.rpm
      [root@client ~]$ rpm -Uvh osg-3.2-el5-release-latest.rpm

For more details, please see our yum repository documentation.

Installation

  1. Install the GlideinWMS Gratia Probe:
    [root@client ~]$ yum install gratia-probe-glideinwms 
  2. Edit the ProbeConfig file, located at /etc/gratia/condor/ProbeConfig. First, set SiteName and ProbeName to unique identifiers for your GlideinWMS Submit host. There can be multiple probes (with different names) per site. If you haven't already, you should register your GlideinWMS Submit host in OIM; you can then use the name under which you registered the resource.
    ProbeName="condor:<hostname>"
    SiteName="HCC-GlideinWMW-Frontend"   
    Next, turn the probe on by setting EnableProbe:
    EnableProbe="1"   
  3. Reconfigure HTCondor:
    [root@client ~]$ condor_reconfig
  4. Start the service and enable it to start automatically when the system reboots:
    [root@client ~]$ service gratia-probes-cron start
    [root@client ~]$ chkconfig --level 345 gratia-probes-cron on
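
The ProbeConfig edits in step 2 can also be scripted. The sketch below is one way to do it with sed, assuming each attribute sits on its own line in the form Name="value" (as in the excerpts in this document). The helper name set_probe_attr is hypothetical. It keeps a backup copy with a .bak suffix, and it is not safe for values that contain |, & or backslashes.

```shell
# Set one attribute in a ProbeConfig-style file, preserving its indentation.
# $1: ProbeConfig path, $2: attribute name, $3: new value
set_probe_attr() {
    sed -i.bak "s|^\([[:space:]]*\)$2=\"[^\"]*\"|\1$2=\"$3\"|" "$1"
}
```

A possible invocation, run before condor_reconfig:

    [root@client ~]$ set_probe_attr /etc/gratia/condor/ProbeConfig EnableProbe 1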

Usage Graphs

Usage graphs can be found under the heading Glidein Bar Graphs at http://gratiaweb.grid.iu.edu/gratia/xml

Specifically, this page may be useful. Replace the variable probe with the name of your probe that you configured above.

NOTE: Usage could be delayed up to a few hours after the job has completed.

Unusual Use Cases

Users without Certificates

If you have users who submit jobs without a certificate explicitly declared in the submit file, you will need to add MapUnknownToGroup to the ProbeConfig. In the file /etc/gratia/condor/ProbeConfig, add the attribute after EnableProbe.

    ...
    SuppressGridLocalRecords="0"
    EnableProbe="1"
    MapUnknownToGroup="1"

    Title3="Tuning parameter"
    ...

Further, if you want to record all usage as coming from a single VO, you can configure the probe to override the 'guessed' VO. In the example below, replace Engage with the registered VO that you would like to report as. If you are not affiliated with a VO, you may use Engage.

...
    MapUnknownToGroup="1"
    MapGroupToRole="1"
    VOOverride="Engage"
...

Non-Standard Condor Install

If Condor is installed in a non-standard location (i.e. not installed from RPMs, or installed from an RPM relocated outside /usr/bin), then you need to tell the probe where to find the Condor binaries. This is done with a special attribute, CondorLocation, in /etc/gratia/condor/ProbeConfig. Point it to the location of the Condor install, such that CondorLocation/bin/condor_version exists.
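
Before writing a value into ProbeConfig, it can help to confirm that it satisfies the condition above. The following is a minimal sketch; the function name check_condor_location is hypothetical, and /opt/condor in the usage line is only an example path.

```shell
# Verify that a candidate CondorLocation contains an executable bin/condor_version.
# $1: candidate CondorLocation value
check_condor_location() {
    if [ -x "$1/bin/condor_version" ]; then
        echo "ok: $1/bin/condor_version found"
    else
        echo "missing: $1/bin/condor_version" >&2
        return 1
    fi
}
```

A possible invocation, followed by the matching ProbeConfig line:

    [root@client ~]$ check_condor_location /opt/condor
    CondorLocation="/opt/condor"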

No RPM install of Condor

By default, installing the Gratia probe will also install Condor if it has not already been installed by RPM (or yum). To prevent this, install the empty-condor package, which makes yum believe that Condor is already present:
    [root@client ~]$ yum install empty-condor gratia-probe-condor

New Data Directory

In /etc/gratia/condor/ProbeConfig, the value of DataFolder (near the bottom) needs to be the same as the Condor configuration variable PER_JOB_HISTORY_DIR. You can get the value of PER_JOB_HISTORY_DIR with the command:
    [root@client ~]$ condor_config_val PER_JOB_HISTORY_DIR
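
The comparison can be automated. The sketch below assumes that DataFolder appears on its own line as DataFolder="..." in ProbeConfig; the function names probe_data_folder and data_folder_matches are hypothetical. Trailing slashes are ignored, which is an assumption about what counts as "the same" directory.

```shell
# Print the DataFolder value from the ProbeConfig at $1
probe_data_folder() {
    sed -n 's|^[[:space:]]*DataFolder="\([^"]*\)".*|\1|p' "$1"
}

# Succeeds if DataFolder equals the given directory, ignoring trailing slashes.
# $1: ProbeConfig path, $2: value of PER_JOB_HISTORY_DIR
data_folder_matches() {
    [ "$(probe_data_folder "$1" | sed 's|/*$||')" = "$(printf '%s\n' "$2" | sed 's|/*$||')" ]
}
```

A possible invocation:

    [root@client ~]$ data_folder_matches /etc/gratia/condor/ProbeConfig "$(condor_config_val PER_JOB_HISTORY_DIR)" || echo "DataFolder differs"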

Different collector and other customizations

By default, the probe reports to the OSG Gratia service (Collector). To change that, edit the configuration file, /etc/gratia/condor/ProbeConfig, and replace the OSG production host with your desired one:
...
    CollectorHost="gratia-osg-prod.opensciencegrid.org:80"
    SSLHost="gratia-osg-prod.opensciencegrid.org:443"
    SSLRegistrationHost="gratia-osg-prod.opensciencegrid.org:80"
...
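
If you prefer to script this change, the following sketch rewrites the host part of the three attributes shown above while leaving the port numbers untouched. The function name set_collector_host and the hostname gratia.example.org are hypothetical. The sketch assumes each attribute is on its own line and keeps a backup copy with a .bak suffix.

```shell
# Replace the hostname (everything before the colon) in the three host attributes.
# $1: ProbeConfig path, $2: new collector hostname, without a port
set_collector_host() {
    sed -i.bak \
        -e "s|^\([[:space:]]*CollectorHost=\"\)[^:\"]*|\1$2|" \
        -e "s|^\([[:space:]]*SSLHost=\"\)[^:\"]*|\1$2|" \
        -e "s|^\([[:space:]]*SSLRegistrationHost=\"\)[^:\"]*|\1$2|" \
        "$1"
}
```

A possible invocation:

    [root@client ~]$ set_collector_host /etc/gratia/condor/ProbeConfig gratia.example.org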

You can find more information about the content of the configuration file and how to change it in ProbeConfig and ProbeConfigCondor.

References

Some links about OSG Accounting with Gratia: The GlideinWMS

Topic revision: r29 - 06 Dec 2016 - 18:12:35 - KyleGross