Condor can be downloaded from http://www.cs.wisc.edu/condor/downloads/v6.8.license.html. You will need to enter your name and email address.
If you choose a tarball, unpack it and run the installation script, following the on-screen instructions. Most options can be left at their defaults, but you should answer yes to the question "Would you like to setup this host as a submit-only machine?".
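The steps above look roughly like the following transcript. The tarball name and version are examples only; use the file you actually downloaded, and check the README in the tarball for the exact name of the installation script.

```
# version and platform are examples -- substitute the tarball you downloaded
tar xzf condor-6.8.2-linux-x86-glibc23.tar.gz
cd condor-6.8.2
./condor_install        # answer "yes" to the submit-only question
```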
In this section we will only examine how to construct the JDL file, so first prepare your job as you normally would.
First you have to specify that you want to submit to a grid resource. This is done by using the grid universe:
universe = grid
Then you have to specify that you want a NorduGrid resource and which one you want. Condor does not do brokering of NG resources, so you must name the CE yourself:
grid_resource = nordugrid morpheus.dcgc.dk
That done, you can now use standard JDL notation to specify the executable, stdout, stderr and log:
executable = myjob
output = my.out
error = my.err
log = my.log
If you need to pass arguments to the job, use the JDL args attribute. Condor translates args to ARC 0.4 xRSL arguments, i.e. it prepends the executable name to the argument list:
args = -l (this is translated to (arguments=myjob -l))
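The translation rule above can be sketched as a few lines of shell. This is an illustration of the behaviour, not Condor source code; the variable names are made up for the example.

```shell
#!/bin/sh
# Illustrative sketch only: Condor <= 6.8.x builds the ARC 0.4 xRSL
# "arguments" attribute by prepending the executable name to the JDL args.
executable="myjob"
args="-l"
echo "(arguments=$executable $args)"
```

Running this prints (arguments=myjob -l), matching the translation shown above.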
To transfer files other than the executable, stdout and stderr, you need to add the following attributes:
transfer_input_files = benchmark.pov
transfer_output_files = benchmark.png
And to tell Condor when to transfer the files:
WhenToTransferOutput = ON_EXIT
Because of a bug in Condor (<= v6.8.2) you must repeat the executable in the nordugrid_rsl attribute. Here you can also add any other xRSL attributes you need, such as runtimeenvironment:
nordugrid_rsl = (executable=runpov.sh)(runtimeenvironment=APPS/GRAPH/POVRAY-3.6)
Finally, add queue to the JDL file:
queue
Example job:
universe = grid
executable = runpov.sh
WhenToTransferOutput = ON_EXIT
transfer_input_files = benchmark.pov
transfer_output_files = benchmark.png
output = pov.$(Cluster).out
error = pov.$(Cluster).err
log = pov.$(Cluster).log
grid_resource = nordugrid interop.dcgc.dk
nordugrid_rsl = (executable=runpov.sh)(runtimeenvironment=APPS/GRAPH/POVRAY-3.6)
queue
This job can then be submitted with:
condor_submit test.jdl
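Once the job is submitted, the standard Condor queue commands apply. Both commands below are regular Condor tools; the cluster id placeholder is whatever number condor_submit reported.

```
condor_q                # show the status of your submitted jobs
condor_rm <cluster_id>  # remove a job from the queue
```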
The following jobs have been run to test Condor NG submission.
Simple hello world test.
universe = grid
executable = myjob
output = my$(Cluster).out
error = my$(Cluster).err
log = my$(Cluster).log
grid_resource = nordugrid interop.dcgc.dk
queue
#!/bin/sh
echo "hello world!"
Fails.
Bug in Condor's translation of the executable attribute; see the analysis below.
Simple hello world test w/ workaround.
universe = grid
executable = myjob
output = my$(Cluster).out
error = my$(Cluster).err
log = my$(Cluster).log
grid_resource = nordugrid interop.dcgc.dk
nordugrid_rsl = (executable=myjob)
queue
#!/bin/sh
echo "hello world!"
Success.
Job that transfers files to and from the CE and uses a runtime environment.
universe = grid
executable = runpov.sh
WhenToTransferOutput = ON_EXIT
transfer_input_files = benchmark.pov
transfer_output_files = benchmark.png
output = pov.$(Cluster).out
error = pov.$(Cluster).err
log = pov.$(Cluster).log
grid_resource = nordugrid interop.dcgc.dk
nordugrid_rsl = (executable=runpov.sh)(runtimeenvironment=APPS/GRAPH/POVRAY-3.6)
queue
#! /bin/sh
echo $PATH
povray -H200 -W200 -d benchmark.pov
Success.
Same as test 3 but uses arguments.
universe = grid
executable = runpov.sh
args = -d benchmark.pov
WhenToTransferOutput = ON_EXIT
transfer_input_files = benchmark.pov
transfer_output_files = benchmark.png
output = pov.$(Cluster).out
error = pov.$(Cluster).err
log = pov.$(Cluster).log
grid_resource = nordugrid interop.dcgc.dk
nordugrid_rsl = (executable=runpov.sh)(runtimeenvironment=APPS/GRAPH/POVRAY-3.6)
queue
#! /bin/sh
echo $PATH
povray -H200 -W200 $@
Fails.
Condor uses ARC 0.4 syntax for argument passing, which is not compatible with ARC 0.5+.
Same as test 4, but passes the arguments via the xRSL arguments attribute in nordugrid_rsl and uses multiple executables.
universe = grid
executable = runpov.sh
WhenToTransferOutput = ON_EXIT
transfer_input_files = benchmark.pov, post.sh
transfer_output_files = benchmark.png, post.txt
output = pov.$(Cluster).out
error = pov.$(Cluster).err
log = pov.$(Cluster).log
grid_resource = nordugrid interop.dcgc.dk
nordugrid_rsl = (executable=runpov.sh)(runtimeenvironment=APPS/GRAPH/POVRAY-3.6)(arguments=-d benchmark.pov)
queue
#! /bin/sh
echo $PATH
povray -H200 -W200 $@
post.sh:
#!/bin/sh
echo "done" > post.txt
Failure.
Condor still overrides the arguments attribute. The execute permission bit is not preserved on transfer, and Condor does not add the transferred files to the xRSL (executables=) attribute.
Job that needs its output in a separate directory.
universe = grid
executable = /bin/echo
output = /tmp/my$(Cluster).out
error = my$(Cluster).err
log = my$(Cluster).log
grid_resource = nordugrid interop.dcgc.dk
nordugrid_rsl=(executable=/bin/echo)
queue
Failure.
Condor submits this example correctly, but then tries to retrieve /tmp/my??.out from the CE instead of retrieving my??.out into /tmp/my??.out on the local machine.
As the tests show, standard jobs fail to submit to an ARC server.
Analysis of the submitted xRSL script revealed that Condor translates the JDL
executable = job.sh
to the ARC attribute
(executables=job.sh)
While this is valid xRSL, it only tells ARC which files to grant execute rights; it does not tell ARC which file contains the actual job. This also makes it impossible to run a job with multiple executables.
Since Condor does not set the executable attribute we can do so ourselves by adding the following line to our JDL file:
nordugrid_rsl= (executable=job.sh)
Change job.sh to whatever you put in the JDL executable.
To run jobs with multiple executables, the script named in the executable attribute should run chmod to restore execute permissions on the other executables.
This bug has been fixed and should be included in the next release of Condor.
Using directories in the file statements makes Condor try to retrieve faulty filenames from the NG CE.
Do not use directory names in output files; e.g. output = /tmp/out.std will not work.
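In submit-file terms, the restriction looks like this; the workaround is simply to retrieve into the submit directory and move the file yourself afterwards:

```
# wrong: Condor will ask the CE for /tmp/my123.out
output = /tmp/my$(Cluster).out

# right: retrieve into the submit directory, then move the file locally
output = my$(Cluster).out
```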
Arguments are handled in a fashion compatible with ARC 0.4.x.
Remember to treat arguments in a 0.4-compatible way, and if necessary specify (middleware<=0.4.5) in nordugrid_rsl.
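A sketch of such a constrained submit-file line is shown below; the (middleware<=0.4.5) clause is taken from the note above, and myjob is a placeholder for your own executable.

```
# restrict the job to an ARC 0.4 CE when it depends on 0.4-style arguments
nordugrid_rsl = (executable=myjob)(middleware<=0.4.5)
```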
At the moment there is no way to specify non-local files in the JDL; that is, there is no equivalent of the xRSL statement
(outputfiles=(in.txt "gsiftp://..../in.txt"))
Jobs can only take local input, and output files are either retrieved back to the local disk or left on the execution cluster.