
Condor Shared Install

by MarcoMambelli


We recommend using the RPM distribution of Condor: its installation is more standard and automatic. Reasons to use the installation based on the TAR distribution described here include: your platform is not supported by the current RPM release, or you want multiple versions of Condor installed and selectable at the same time. If you are using Rocks, commonly used by CMS, or you install using Kickstart files like the one provided by ATLAS, then you may not need any of this: they may install and set up Condor for you. Check your VO documentation first.

We will use the latest stable release of Condor. As of May 20, 2010, this is 7.4.2. Our approach will be to install Condor in /opt/condor on the management node (gc1-ce) and share it with the other nodes in the cluster. We will also prepare the host-specific configuration files for all nodes participating in the Condor pool and store them in a standard location /nfs/condor/condor-etc. This will make it easy to make cluster-wide configuration changes.


A note about the directory structure

This Condor installation is structured to facilitate Condor upgrades with minimal effort. The Condor release directory on each host is /opt/condor, which is a soft link to the NFS-exported /nfs/condor/condor, which in turn is a link to the actual Condor installation directory that includes the version number: /nfs/condor/condor-7.4.2. This makes it possible to keep multiple releases installed in subdirectories of /nfs/condor/. Some configuration files are kept outside, in the /nfs/condor/condor-etc directory. The motivation for this choice should become clearer in the Upgrades section below.
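For example, once the installation described below is complete, the chain of links can be checked on any node with readlink (the paths and version number assume the layout used in this document):

    readlink /opt/condor          # -> /nfs/condor/condor
    readlink /nfs/condor/condor   # -> condor-7.4.2
    ls /nfs/condor/               # condor  condor-7.4.2  condor-etc  (one directory per installed release)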

A note about Condor configuration files

Condor is highly configurable. All parameters have default values (described in the Condor manual) and can be overridden by the content of a chain of configuration files (condor_config). Each of these files may contain the entry LOCAL_CONFIG_FILE, which points to the next file in the chain. This allows the specification of multiple files as the local configuration file, each one processed in the order given (with parameters set in later files overriding values from earlier files). In our configuration we'll use three levels of configuration:
  • condor_config - provided by the Condor release and left mostly unmodified, this is in the release directory
  • condor_config.cluster - modification common to all the machines in the cluster, this is in a shared configuration directory
  • condor_config.$(HOSTNAME) - files specific for one node (or one type of nodes, e.g. all the worker nodes), these are in the shared configuration directory as well
The Condor manual provides a complete guide on Condor configuration.
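As an illustration, under the layout set up later in this document the chain looks roughly like this (the host-specific file and its contents are examples and depend on the node):

    # /opt/condor/etc/condor_config (release file, mostly unmodified)
    LOCAL_CONFIG_FILE = /opt/condor/etc/condor_config.local

    # /opt/condor/etc/condor_config.local
    LOCAL_CONFIG_FILE = /nfs/condor/condor-etc/condor_config.cluster

    # /nfs/condor/condor-etc/condor_config.cluster (cluster-wide settings)
    LOCAL_CONFIG_FILE = /nfs/condor/condor-etc/condor_config.$(HOSTNAME)

    # /nfs/condor/condor-etc/condor_config.gc1-wn001 (host-specific settings, e.g. a worker node)
    DAEMON_LIST = MASTER, STARTD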

Management node installation

Install Condor on the machine that will be the central manager. In this tutorial we use gc1-ce, the same host as the OSG compute element.

If the /nfs/condor directory is exported with root squash from an NFS server that is not also the central manager, you'll have to adapt the following instructions. You have to edit the configuration files in /nfs/condor/condor-etc/ from the NFS server. Furthermore, you have two options for the installation (a hypothetical NFS exports entry is sketched after this list):

  1. you disable root squash for the installation (and any time you perform Condor upgrades).
  2. you perform the installation (and upgrades) on the NFS server, then remove the local Condor directories from it and create them on the central manager (the same way you do on the other nodes).
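For reference, a hypothetical /etc/exports entry on the NFS server could look like the line below; the network address and options are examples and must be adapted to your site. For option 1 you would temporarily replace root_squash with no_root_squash and re-export with exportfs -ra.

    # hypothetical entry in /etc/exports on the NFS server
    /nfs/condor  192.168.1.0/24(rw,sync,root_squash)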

  • Go to the download page and choose the Condor version (e.g. 7.4.2, current production version).
  • Choose the correct distribution for your platform (the one listed under "condor 7.4.2 for RHEL 5, dynamically linked"):
    • condor-7.4.2-linux-x86-rhel5-dynamic.tar.gz (for x86 OS)
    • condor-7.4.2-linux-x86_64-rhel5-dynamic.tar.gz (for x86_64 OS)
  • Save the downloaded file to your home directory.
  • Create the machine-specific directories that Condor uses:
    mkdir /scratch
    mkdir /scratch/condor

  • Make sure the condor user exists with a shared home directory (/home/condor) (See the instructions below if this is not possible).

  • Create the shared directory for configuration files:
    mkdir /nfs/condor/condor-etc

  • Prepare the target installation directory:
    cd /nfs/condor
    mkdir condor-7.4.2
    ln -s condor-7.4.2 condor

  • Make sure that /opt/condor is a soft link to /nfs/condor/condor.
    ln -s /nfs/condor/condor /opt/condor

  • Create a temporary directory and extract the tarball you downloaded to your home directory:
    mkdir /tmp/condor-src
    cd /tmp/condor-src
    tar xvzf ~/condor-7.4.2-linux-x86-rhel5-dynamic.tar.gz 

  • Run the Condor installation script:
    cd condor-7.4.2
    ./condor_install --prefix=/opt/condor --local-dir=/scratch/condor --type=manager
  • The installation script places the machine-specific configuration file in the directory specified by --local-dir. We want to manage these files in the shared area, so:
    mv /scratch/condor/condor_config.local /opt/condor/etc/
  • Open the global Condor configuration file /opt/condor/etc/condor_config in an editor.
    • Change the value of this variable (find it in the file):
      LOCAL_CONFIG_FILE = /opt/condor/etc/condor_config.local
  • Open the local Condor configuration file /opt/condor/etc/condor_config.local in an editor.
    • At the top add this line:
      LOCAL_CONFIG_FILE = /nfs/condor/condor-etc/condor_config.cluster
  • Optionally you can copy the important values (RELEASE_DIR, MAIL, CONDOR_IDS, LOCK, JAVA, JAVA_MAXHEAP_ARGUMENT) from condor_config.local into condor_config.cluster and have the global Condor configuration file point directly to the second one. Inspect the file to make sure that no important setting is skipped.
  • Edit the cluster Condor configuration file /nfs/condor/condor-etc/condor_config.cluster:
    • Copy the following content, changing the values to suit your cluster. You may find some suggestions in the local configuration file /scratch/condor/condor_config.local:
      ## Condor configuration for OSG T3
      ## For more detail please see
      LOCAL_CONFIG_FILE = /nfs/condor/condor-etc/condor_config.$(HOSTNAME)
      LOCAL_DIR = /scratch/condor
      # The following should be your T3 domain
      UID_DOMAIN =
      # Human readable name for your Condor pool
      COLLECTOR_NAME = "Tier 3 Condor at $(UID_DOMAIN)"
      # A shared file system (NFS), e.g. for the job directories, is assumed
      # among nodes with the same FILESYSTEM_DOMAIN
      FILESYSTEM_DOMAIN = $(UID_DOMAIN)
      # The following should be the full name of the head node
      CONDOR_HOST = gc1-hn
      # Port range should be opened in the firewall (can be different on different machines)
      # This 9000-9999 range is coherent with the iptables configuration in the T3 documentation
      IN_HIGHPORT = 9999
      IN_LOWPORT = 9000
      # This is to enforce password authentication
      SEC_CLIENT_AUTHENTICATION_METHODS = password,fs,gsi,kerberos
      SEC_PASSWORD_FILE = /scratch/condor/condor_credential
      ALLOW_DAEMON = condor_pool@*
      ## NEGOTIATOR_INTERVAL sets how often the condor_negotiator starts a negotiation cycle;
      ## it is defined in seconds and defaults to 60 (1 minute). SCHEDD_INTERVAL defaults to 300.
      ## Scheduling parameters for the startd:
      # start as available and do not suspend, preempt or kill
      START = TRUE
      SUSPEND = FALSE
      PREEMPT = FALSE
      KILL = FALSE
    • Make sure that you have the following important line in the file (shown in the example above):
      CONDOR_HOST = gc1-hn
      • Note: CONDOR_HOST can be set with or without the domain name, e.g. gc1-hn or gc1-hn.<your_domain>.
      • Note: CONDOR_HOST is the node running the Condor collector and negotiator. If you are running them on separate nodes, define both NEGOTIATOR_HOST and COLLECTOR_HOST and set each to the correct host name.
  • Link condor_config in the condor user's home directory to the location of the global configuration file (this allows condor to find the file without requiring environment variables):
    ln -s /opt/condor/etc/condor_config ~condor/condor_config
  • Create the host-specific configuration files for the nodes using the following content. We will create 3 base configuration files: one for the headnode, one for the worker nodes, and one for the interactive nodes (user interface).
    • For the headnode, /nfs/condor/condor-etc/condor_config.headnode:
      ## OSG T3 host configuration
      ## For more info:
      # List of daemons on the node (the headnode requires collector and negotiator,
      # schedd is required to submit jobs, startd to run jobs)
      DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, SCHEDD
    • For the worker nodes, /nfs/condor/condor-etc/condor_config.worker:
      ## OSG T3 host configuration
      ## For more info:
      # List of daemons on the node (the headnode requires collector and negotiator,
      # schedd is required to submit jobs, startd to run jobs)
      DAEMON_LIST = MASTER, STARTD
    • For the interactive nodes, /nfs/condor/condor-etc/condor_config.interactive:
      ## OSG T3 host configuration
      ## For more info:
      # List of daemons on the node (the headnode requires collector and negotiator,
      # schedd is required to submit jobs, startd to run jobs)
      DAEMON_LIST = MASTER, SCHEDD
  • Then for each node create a link pointing to the template, e.g.:
    cd /nfs/condor/condor-etc/
    ln -s condor_config.headnode condor_config.gc1-hn
    ln -s condor_config.interactive condor_config.gc1-ui001
    ln -s condor_config.worker condor_config.gc1-wn001
    ln -s condor_config.worker condor_config.gc1-wn002
    ln -s condor_config.worker condor_config.gc1-wn003
    Each node must have its own condor_config.$(HOSTNAME) file. If some nodes require a special configuration you can copy the template (e.g. condor_config.worker) and customize it. A small script to create the links for many nodes is sketched at the end of this section.
  • If you are not installing Condor on the headnode (e.g. you are installing on the NFS server), stop here, log in on the headnode and follow the instructions in the next section. On the headnode, set up Condor and set the password that will be used by the Condor system (at the prompt enter the same password for all nodes):
    source /opt/condor/condor.sh
    condor_store_cred -c add
  • Start up the Condor master and check for running processes:
    /opt/condor/sbin/condor_master
    ps -ef |grep condor
    condor    4404     1  0 06:42 ?        00:00:00 /opt/condor/sbin/condor_master
    condor    4405  4404  0 06:42 ?        00:00:00 condor_collector -f
    condor    4406  4404  1 06:43 ?        00:00:00 condor_negotiator -f
    root      4410  4348  0 06:43 pts/2    00:00:00 grep condor
  • To shut down:
    /opt/condor/sbin/condor_off -master
  • Enable automatic startup at boot by setting the correct path of the condor_master executable (MASTER=/opt/condor/sbin/condor_master) in the boot file /opt/condor/etc/examples/condor.boot. Then:
    cp /opt/condor/etc/examples/condor.boot /etc/init.d/condor
    chkconfig --level 235 condor on
  • Add Condor commands to the user environment (if you like to have condor commands in the path):
    cp /opt/condor/condor.sh /opt/condor/condor.csh /etc/profile.d/
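If you have many worker nodes, the per-node configuration links mentioned above can be scripted. This is just a sketch with hypothetical node names (gc1-wn001 ... gc1-wn010); adapt the range and names to your cluster:

    cd /nfs/condor/condor-etc/
    for i in $(seq 1 10); do
        n=$(printf "%03d" "$i")
        ln -s condor_config.worker "condor_config.gc1-wn${n}"
    done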

Setting up the other nodes

This should be executed on all other nodes where Condor should be running (worker nodes, submit hosts). A script collecting these steps is sketched at the end of this section.
  • Link the installation directory and create the local Condor directories in /scratch/condor (making sure that the execute directory has the right permissions: a+rwx plus the sticky bit +t):
    ln -s /nfs/condor/condor /opt/condor
    mkdir /scratch
    mkdir /scratch/condor
    mkdir /scratch/condor/execute
    mkdir /scratch/condor/log
    mkdir /scratch/condor/spool
    chown condor /scratch/condor/*
    chmod a+rwx /scratch/condor/execute
    chmod +t /scratch/condor/execute
  • At the end you should see something like:
    ls -l /scratch/condor/
    total 12
    drwxrwxrwt 2 condor root 4096 Oct 15 04:07 execute
    drwxr-xr-x 2 condor root 4096 Oct 15 04:15 log
    drwxr-xr-x 2 condor root 4096 Oct 15 04:02 spool
  • Setup Condor and set the password that will be used by the Condor system (at the prompt enter the same password for all nodes):
    source /opt/condor/condor.sh
    condor_store_cred -c add
  • Enable automatic startup:
    cp /opt/condor/etc/examples/condor.boot /etc/init.d/condor
    chkconfig --level 235 condor on
  • Add Condor commands to the user environment (if you like to have condor commands in the path):
    cp /opt/condor/condor.sh /opt/condor/condor.csh /etc/profile.d/
  • Start Condor and check the processes:
    /etc/init.d/condor start
    # should reply: Starting up Condor
    ps -ef |grep condor
    # you should see condor_master and the desired daemons (depending on the host)
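The steps above can be collected in a small script, shown here only as a sketch under the assumptions of this document (the paths, an existing condor user, and the condor.boot example file); the password step (condor_store_cred -c add) is interactive and is intentionally left out:

    #!/bin/sh
    # sketch: per-node setup for the shared Condor installation described above
    ln -s /nfs/condor/condor /opt/condor
    mkdir -p /scratch/condor/execute /scratch/condor/log /scratch/condor/spool
    chown condor /scratch/condor/*
    chmod a+rwx /scratch/condor/execute
    chmod +t /scratch/condor/execute
    cp /opt/condor/etc/examples/condor.boot /etc/init.d/condor
    chkconfig --level 235 condor on
    cp /opt/condor/condor.sh /opt/condor/condor.csh /etc/profile.d/
    /etc/init.d/condor start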


Upgrades

As explained in the section about the directory structure, this Condor installation is structured to facilitate upgrades with minimal effort. The Condor directory on each host is /opt/condor. This is a link to the exported /nfs/condor/condor, which is a link to the real Condor installation directory that also has the version number in its name (e.g. /nfs/condor/condor-7.4.2).

Only one version of Condor will be used at a time, but linking /nfs/condor/condor to a new directory, e.g. /nfs/condor/condor-7.3.2, lets you install that version (similarly to what was done above), test it, and revert to the previous one if desired (e.g. if the new installation is not working properly) simply by changing one link.
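A possible upgrade sequence, using a hypothetical new release X.Y.Z and the layout of this document (run it where /nfs/condor is writable, typically the NFS server):

    cd /nfs/condor
    mkdir condor-X.Y.Z
    # ... install the new release into condor-X.Y.Z as described above ...
    rm condor                     # removes only the link, not the old release directory
    ln -s condor-X.Y.Z condor
    # to revert, point the link back to the previous release:
    # rm condor ; ln -s condor-7.4.2 condor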


Remember that running
    source /opt/condor/condor.sh
will add the condor commands to your path and set the CONDOR_LOCATION variable. If you copied the condor.sh and condor.csh files to /etc/profile.d/ as described above, Condor will be in the environment of every user automatically and there is no need to source any file.

Testing the installation

You can see the resources in your Condor cluster using condor_status and submit test jobs with condor_submit. Check CondorTest for more.
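For instance, a minimal test submit description file (a hypothetical test.sub) could look like the one below; submit it with condor_submit test.sub and follow its progress with condor_q:

    # test.sub - minimal vanilla universe test job
    universe   = vanilla
    executable = /bin/hostname
    output     = test.$(Cluster).$(Process).out
    error      = test.$(Cluster).$(Process).err
    log        = test.$(Cluster).log
    queue 1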

Special needs

The following sections present instructions and suggestions for uncommon configurations.

Changes to the Firewall (IPtables)

If you are using a firewall (e.g. iptables) on your nodes, you need to open the ports used by Condor:
  • Edit the /etc/sysconfig/iptables file to add these lines ahead of the reject line:
    -A RH-Firewall-1-INPUT  -s <network_address> -m state --state ESTABLISHED,NEW -p tcp -m tcp --dport 9000:10000 -j ACCEPT  
    -A RH-Firewall-1-INPUT  -s <network_address> -m state --state ESTABLISHED,NEW -p udp -m udp --dport 9000:10000 -j ACCEPT 
    where <network_address> is the address of the intranet of the T3 cluster (or the extranet if your T3 does not have a separate intranet). You can omit the -s option if you have nodes of your Condor cluster (startd, schedd, ...) outside of that network.
  • Restart the firewall:
    /etc/init.d/iptables restart

Swap configuration on the Condor submit host

If you have no swap space on the submit host, all your job submissions will fail (jobs remain Idle) and in /scratch/condor/log/SchedLog you will see errors like "Swap space estimate reached! No more jobs can be run!".

You can check your swap memory with cat /proc/meminfo |grep Swap and you will see something like:

SwapCached:          0 kB
SwapTotal:     2097144 kB
SwapFree:      2097080 kB
If the numbers above are all 0, you have no swap space.

The recommendation is to enable swap space on your host, for example by adding a swap file (a sketch follows). If you cannot, there is a workaround in Condor to avoid the need for swap space; it will work for small test clusters.
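A possible way to add a 2 GB swap file (the size and the /swapfile path are examples; adjust them to your host):

    dd if=/dev/zero of=/swapfile bs=1M count=2048
    chmod 600 /swapfile
    mkswap /swapfile
    swapon /swapfile
    # to make it permanent, add a line like the following to /etc/fstab:
    # /swapfile  swap  swap  defaults  0 0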

In the file condor_config.gc1-ui (the host configuration of the submit host) add the line:

    RESERVED_SWAP = 0

This stops the schedd from limiting the number of running jobs based on the available swap space.

Multiple Condor version at the same time

The Upgrades section above shows an easy way to deal with upgrades, but there may be a need to run multiple versions of Condor at the same time, e.g. if the cluster has nodes with different operating systems that require different binary versions of Condor. If your cluster is heterogeneous (nodes are different), the RPM installation may be easier.

To have multiple versions of Condor in a shared installation:

  • download and install all the versions in subdirectories of /nfs/condor/:
    mkdir /nfs/condor/condor-versionX
    mkdir /tmp/install
    cd /tmp/install
    tar xvzf <downloaded condor-versionX tarball>
    cd condor-versionX
    ./condor_install --prefix=/nfs/condor/condor-versionX --local-dir=/tmp/scratch-versionX --type=manager
  • on each node link the correct version to /opt/condor:
    ln -s /nfs/condor/condor-versionX /opt/condor
  • on each node fix the file /opt/condor/etc/condor_config as described above
  • the other instructions are similar:
    • prepare the same way the shared configuration files
    • create the local condor directories
    • fix the automatic startup

No shared user directories

Condor assumes that the directories used to submit jobs (e.g. the home directories) are shared among all the nodes that share the same string for FILESYSTEM_DOMAIN. If this is not the case for you, you have to:
  • change FILESYSTEM_DOMAIN in the cluster or host configuration file so that it is different for all the nodes, e.g. FILESYSTEM_DOMAIN = $(HOSTNAME)
  • when you submit a job, tell Condor to transfer the files: the executable and the input and output files (see the submit file fragment below). If the files are not transferred, the job will not be matched to a node with a FILESYSTEM_DOMAIN different from the submit host's.
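A hypothetical submit file fragment enabling Condor file transfer (the executable and input file names are examples):

    universe                = vanilla
    executable              = myjob.sh
    transfer_input_files    = input1.dat, input2.dat
    should_transfer_files   = YES
    when_to_transfer_output = ON_EXIT
    queue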


