osg-ce.ligo.caltech.edu as an example, which should not be used to run your tests!

grid certificate is to authenticate yourself with every service provided by a Compute Element. The authentication process is based on the X.509 Public Key Infrastructure, which uses a private key (userkey.pem) and a public key (usercert.pem). Together, the two files form your certificate and are located by default in $HOME/.globus:
[user@client ~]$ ls -al $HOME/.globus
total 68
drwxr-x--- 4 user user 4096 Mar 18 18:50 .
drwx------ 29 user user 12288 Jul 28 08:38 ..
drwxrwxr-x 5 user user 4096 Jul 20 11:18 .gass_cache
drwx------ 4 user user 4096 Mar 27 12:14 job
-rw-r----- 1 user user 1724 Dec 16 2008 usercert.pem
-r-------- 1 user user 1919 Dec 16 2008 userkey.pem

Your certificate can be used to create a limited-lifetime Grid Proxy or a VOMS Proxy. You should use a VOMS Proxy if your Virtual Organization supports VOMS Proxies and provides a VOMS server. Some OSG sites may require a VOMS Proxy and will reject authentication requests that use Grid Proxies.
grid-proxy-info can be used:
[user@client ~]$ grid-proxy-info
subject : /DC=org/DC=doegrids/OU=People/CN=Firstname Lastname 392994/CN=1692124231
issuer : /DC=org/DC=doegrids/OU=People/CN=Firstname Lastname 392994
identity : /DC=org/DC=doegrids/OU=People/CN=Firstname Lastname 392994
type : Proxy draft (pre-RFC) compliant impersonation proxy
strength : 512 bits
path : /tmp/x509up_u500
timeleft : 291:11:00 (12.1 days)

Identity is sometimes also referred to as Grid Identity or, more frequently, as Distinguished Name. The last line of the output above indicates how long your current grid proxy will be valid. A grid proxy is said to have expired if there is no time left on it. To renew your grid proxy, the program grid-proxy-init is used together with your grid password:
[user@client ~]$ grid-proxy-init -valid 500:00
Your identity: /DC=org/DC=doegrids/OU=People/CN=Firstname Lastname 392994
Enter GRID pass phrase for this identity:
Creating proxy ................................. Done
Your proxy is valid until: Tue Aug 18 08:06:35 2009
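The timeleft field printed by grid-proxy-info can also be checked from a script before jobs are submitted. The following is a minimal sketch rather than part of the grid tools: the function names timeleft_to_seconds and proxy_needs_renewal and the one-hour default threshold are assumptions made for illustration, and the code only parses a "timeleft : H:M:S" line in the format shown above.

```shell
#!/bin/sh
# Convert an H:M:S string (as printed in the "timeleft" field) to seconds.
timeleft_to_seconds() {
    printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# proxy_needs_renewal INFO_TEXT [MIN_SECONDS]
# Succeeds (exit status 0) if the first timeleft line in INFO_TEXT shows
# less than MIN_SECONDS remaining (hypothetical default: 3600), or if no
# timeleft line is present at all.
proxy_needs_renewal() {
    left=$(printf '%s\n' "$1" | awk '$1 == "timeleft" { print $3; exit }')
    [ -n "$left" ] || return 0
    [ "$(timeleft_to_seconds "$left")" -lt "${2:-3600}" ]
}

# Possible use on a submission host:
#   if proxy_needs_renewal "$(grid-proxy-info 2>/dev/null)"; then
#       grid-proxy-init -valid 500:00
#   fi
```

Because only the first timeleft line is read, the same check applied to voms-proxy-info output would look at the proxy lifetime and not at the lifetime of the VO extension.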
voms-proxy-info can be used:
[user@client ~]$ voms-proxy-info -all
WARNING: Unable to verify signature! Server certificate possibly not installed.
Error: Cannot verify AC signature!
subject : /DC=org/DC=doegrids/OU=People/CN=Firstname Lastname 392994/CN=proxy
issuer : /DC=org/DC=doegrids/OU=People/CN=Firstname Lastname 392994
identity : /DC=org/DC=doegrids/OU=People/CN=Firstname Lastname 392994
type : proxy
strength : 1024 bits
path : /tmp/x509up_u506
timeleft : 388:18:27
=== VO LIGO extension information ===
VO : LIGO
subject : /DC=org/DC=doegrids/OU=People/CN=Firstname Lastname 392994
issuer : /DC=org/DC=doegrids/OU=Services/CN=voms.ligo.org
attribute : /LIGO/Role=NULL/Capability=NULL
timeleft : 0:00:00
uri : voms.phys.uwm.edu:15001

Identity is sometimes also referred to as Grid Identity or, more frequently, as Distinguished Name. Note the display of the extended attributes. The first line marked timeleft in the output above indicates how long your current VOMS proxy will be valid. A VOMS proxy is said to have expired if there is no time left on it. To renew your VOMS proxy, the program voms-proxy-init is used together with your grid password:
[user@client ~]$ voms-proxy-init -voms LIGO:/LIGO -valid 500:00
Enter GRID pass phrase:
Your identity: /DC=org/DC=doegrids/OU=People/CN=Firstname Lastname 392994
Creating temporary proxy ................................ Done
Contacting voms.phys.uwm.edu:15001 [/DC=org/DC=doegrids/OU=Services/CN=voms.ligo.org] "LIGO" Done
Creating proxy ......................................................... Done
voms-proxy-init to contact the VOMS server for the LIGO VO. This may be different for your VO. Locate a file named vomses on your submission host to find out more about available VOMS servers.
[user@client ~]$ globusrun -a -r ce.opensciencegrid.org
GRAM Authentication test successful
globus-url-copy command can be used for that purpose. To verify that GridFTP is working correctly we will
[user@client ~]$ dd if=/dev/zero of=/tmp/ce.opensciencegrid.org.test.0 bs=1k count=1024
1024+0 records in
1024+0 records out
1048576 bytes (1.0 MB) copied, 0.005332 seconds, 197 MB/s

Next, let's transfer the file to the grid resource:
[user@client ~]$ globus-url-copy file:///tmp/ce.opensciencegrid.org.test.0 gsiftp://ce.opensciencegrid.org/~/ce.opensciencegrid.org.test.0
[user@client ~]$ echo $?
0

To copy the file back to your resource, use:
[user@client ~]$ globus-url-copy gsiftp://ce.opensciencegrid.org/~/ce.opensciencegrid.org.test.0 file:///tmp/ce.opensciencegrid.org.test.1
[user@client ~]$ echo $?
0

Finally, let's compare the original with the copy received back from the remote resource:
[user@client ~]$ diff /tmp/ce.opensciencegrid.org.test.0 /tmp/ce.opensciencegrid.org.test.1
[user@client ~]$ echo $?
0
file protocol in the URI is used for files on a local file system. In this case no authentication is performed with the local resource before the file is read or written. Replacing
gsiftp always requires a GridFTP server running at the resource specified in the URI.
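The transfer test above (create a test file, copy it to the resource, copy it back, compare) can be wrapped into a single function. This is a sketch under stated assumptions, not an official OSG script: the function name gridftp_roundtrip, the COPY_CMD variable and the numeric return codes are hypothetical and exist only so that the logic can be tried with a stand-in copy command. The file names and the globus-url-copy invocation follow the examples above.

```shell
#!/bin/sh
# Command used for the transfers. Override COPY_CMD (hypothetical variable)
# to exercise the logic without a GridFTP server.
COPY_CMD=${COPY_CMD:-globus-url-copy}

# gridftp_roundtrip HOST
# Copies a 1 MB test file to HOST, copies it back, and compares both files.
gridftp_roundtrip() {
    host=$1
    src=/tmp/$host.test.0
    back=/tmp/$host.test.1
    remote="gsiftp://$host/~/$host.test.0"

    # Create the 1 MB test file, as in the dd example above.
    dd if=/dev/zero of="$src" bs=1k count=1024 2>/dev/null || return 1
    "$COPY_CMD" "file://$src" "$remote"  || { echo "upload failed" >&2; return 2; }
    "$COPY_CMD" "$remote" "file://$back" || { echo "download failed" >&2; return 3; }
    cmp -s "$src" "$back"                || { echo "files differ" >&2; return 4; }
    echo "GridFTP round trip to $host succeeded"
}
```

The distinct return codes make it easier to tell whether the upload, the download, or the comparison failed when the function is called from another script.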
globus-job-run command to execute the id command on the remote resource using the default fork Job Manager:
[user@client ~]$ globus-job-run ce.opensciencegrid.org /usr/bin/id
uid=506(ligo) gid=506(ligo) groups=506(ligo)

Upon success, /usr/bin/id returns information about the user account that was used to execute the program on the grid resource. This is the account that your Distinguished Name is mapped to. Unlike the Mapping Test, this test also verifies that the account exists and that the job manager was able to run the command on your behalf on the compute element.
fork will be used to fork your program directly on the compute element. As a general rule of thumb, you should avoid using fork to execute programs on the compute element unless that is your intention. A non-exhaustive list of job managers includes sge. Depending on the setup of the grid resource, more than one job manager may be supported. Just like in the previous example, we can use globus-job-run to verify that the fork job manager is available and used for the execution:
[user@client ~]$ globus-job-run ce.opensciencegrid.org/jobmanager-fork /usr/bin/id
uid=506(ligo) gid=506(ligo) groups=506(ligo)

Notice how we append /jobmanager-fork to the grid resource to explicitly request it. This can be used to detect if a certain job manager such as
[user@client ~]$ globus-job-run ce.opensciencegrid.org/jobmanager-condor /bin/hostname
dom118

Because the job is scheduled to be executed through HTCondor, this command will likely require more time to complete. Here we used the /bin/hostname command to return the hostname of the worker node executing the job. Specifying a job manager that is not supported by the grid resource is a convenient way to find out, from the comfort of the command line, which job managers the resource supports:
[user@client ~]$ globus-job-run ce.opensciencegrid.org/jobmanager-pbs /bin/hostname
GRAM Job submission failed because the gatekeeper failed to find the requested service (error code 93)
[user@client ~]$ echo $?
93
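The probing technique shown above can be repeated over a list of candidate job managers. The sketch below is illustrative only: the function name probe_jobmanagers, the RUN_CMD variable and the candidate names passed on the command line are assumptions. It treats an exit status of 0 as "supported" and reports any other status (93 in the example above) without interpreting it further.

```shell
#!/bin/sh
# Command used to submit the test job. Override RUN_CMD (hypothetical
# variable) to try the loop without access to a grid resource.
RUN_CMD=${RUN_CMD:-globus-job-run}

# probe_jobmanagers HOST NAME...
# Runs /bin/hostname through each named job manager and reports the result.
probe_jobmanagers() {
    host=$1
    shift
    for jm in "$@"; do
        if "$RUN_CMD" "$host/jobmanager-$jm" /bin/hostname >/dev/null 2>&1; then
            echo "$jm: supported"
        else
            # In the else branch, $? still holds the status of the probe.
            echo "$jm: not available (exit status $?)"
        fi
    done
}

# Possible use:
#   probe_jobmanagers ce.opensciencegrid.org fork condor pbs sge
```

Keep in mind that every probe other than fork submits a real job to the batch system, so the loop may take a while to finish.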
2811 or the remote resource is down
telnet program can be used to connect to the default GridFTP port 2811 on the server side to check if the GridFTP server will answer:
[user@client ~]$ telnet ce.opensciencegrid.org 2811
Trying 22.214.171.124...
Connected to ce.opensciencegrid.org (126.96.36.199).
Escape character is '^]'.
220 ce.opensciencegrid.org GridFTP Server 2.8 (gcc64dbg, 1217607445-63) [VDT patched 4.0.8] ready.
Connection closed by foreign host.

GridFTP is available if you receive a 'Connected to ...' message. Otherwise you may want to contact the Grid Operation Center and open a ticket for the resource!
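Where telnet is not installed, the same reachability check can be approximated from the shell. This is a sketch with explicit assumptions: it relies on the /dev/tcp redirection feature of bash and on the timeout utility from GNU coreutils being available on the client, and the function name port_open is hypothetical. It only tests whether a TCP connection can be opened and does not read the server banner.

```shell
#!/bin/sh
# port_open HOST PORT [TIMEOUT_SECONDS]
# Succeeds if a TCP connection to HOST:PORT can be opened within the
# timeout (default: 5 seconds). Inside bash -c, $0 is HOST and $1 is PORT.
port_open() {
    timeout "${3:-5}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Possible use with the default GridFTP port from the example above:
#   if port_open ce.opensciencegrid.org 2811; then
#       echo "GridFTP port is reachable"
#   else
#       echo "GridFTP port is blocked or the resource is down"
#   fi
```

A successful connection corresponds to the 'Connected to ...' message printed by telnet; a firewall that silently drops packets shows up as a timeout.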
globus-job-run to list its content:
[user@client ~]$ globus-job-run ce.opensciencegrid.org /bin/bash -c "ls -al ~"
total 17632
drwxr-xr-x 18 ligo ligo 69632 Jun 14 18:40 .
drwxr-xr-x 43 root root 4096 Jun 8 14:06 ..
-rw------- 1 ligo ligo 1118 Jun 14 16:34 .Xauthority
-rw------- 1 ligo ligo 27105 Jun 10 19:50 .bash_history
-rw------- 1 ligo ligo 33 Jan 21 2009 .bash_logout
-rw------- 1 ligo ligo 414 Sep 28 2009 .bash_profile
-rwx------ 1 ligo ligo 225 Jul 14 2009 .bashrc
drwx------ 4 ligo ligo 4096 Dec 7 2009 .globus
-rw-rw-r-- 1 ligo ligo 36 Mar 17 20:55 .gnuplot_history
-rw------- 1 ligo ligo 42 May 28 12:22 .lesshst
drwxrwxr-x 3 ligo ligo 4096 Aug 22 2009 .mc
drwx------ 2 ligo ligo 4096 Apr 22 15:10 .ssh
drwx------ 3 ligo ligo 4096 Jun 13 2009 .subversion
-rwxr-xr-x 1 ligo ligo 10582 Jun 8 10:11 .viminfo
-rw-r--r-- 1 ligo ligo 1048576 Jun 14 18:13 ce.opensciencegrid.org.test.0

If the directory is not available, you may try to create it using mkdir:
[user@client ~]$ globus-job-run ce.opensciencegrid.org /bin/bash -c "mkdir -p /tmp/example"
[user@client ~]$ echo $?
0

If you are certain that the directory should exist on the server, you may want to contact the Grid Operation Center and open a ticket for the resource!
[user@client ~]$ globus-job-run ce.opensciencegrid.org/jobmanager-fork /bin/bash -c "/bin/mount"
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/xvda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
10.10.1.2:/home on /home type nfs (rw,addr=10.10.1.2)
fuse on /mnt/hadoop type fuse (rw,nosuid,nodev,allow_other,default_permissions)
fork with another job manager like condor to execute the same test on a worker node in this case.
id program on the GridFTP server side:
[user@client ~]$ globus-job-run ce.opensciencegrid.org /bin/bash -c "id"
uid=506(ligo) gid=506(ligo) groups=506(ligo)

with the ownership and group membership information of the target directory on the GridFTP server (see here). If you are certain that the directory should be mounted or be mounted with write-access, you may want to contact the Grid Operation Center to open a ticket for the resource!

Job Submission Test fail:
2119 or the remote resource is down
telnet to connect to the default GRAM port 2119 on the server side:
[user@client ~]$ telnet ce.opensciencegrid.org 2119
Trying 188.8.131.52...
Connected to ce.opensciencegrid.org (184.108.40.206).
Escape character is '^]'.
Connection closed by foreign host.

GRAM is running if you receive a 'Connected to ...' message. Otherwise you may want to contact the Grid Operation Center to open a ticket for the resource.

Distinguished Name is mapped to, you will receive an error. In this case the Mapping Test will succeed, but the jobmanager will report that it failed to put the job into execution during the Job Submission Test. Please open a ticket with the Grid Operation Center.
fork Job Manager is used to execute your command on the Compute Element. To test if a particular job manager is supported by the resource, try to submit the id command to that job manager:
[user@client ~]$ globus-job-run ce.opensciencegrid.org:2119/jobmanager-pbs /bin/bash -c "id"
GRAM Job submission failed because the gatekeeper failed to find the requested service (error code 93)
ccs. The spelling is case-sensitive and differs slightly for GRAM-WS:
CCS! It is also called a Factory Type by GRAM-WS.
globus-job-run invocations. In particular, it requires that more client network ports be open. Condor-G listens on the client for incoming connections from the CE (so the Condor-G client acts like a server to some extent). If either the client or the remote site has a firewall blocking these ports, Condor-G will fail but globus-job-run will succeed. Firewall issues may be difficult to track down. The three possible places where you might run into a firewall are:
GRIDMANAGER.IN_LOWPORT = 8000
GRIDMANAGER.IN_HIGHPORT = 8129

After changing this configuration, restart Condor-G.

Grid Operation Center (GOC) for addressing issues users have working with production resources. In order to address your issue as quickly as possible you should: