Q1. I have a program that requires large amounts of CPU time to run.
How can I use the Wisconsin grid to run it?
You can submit your job from a working directory in AFS or on a
local (NFS) disk. You have to tell Condor 1) whether or not you want
it to transfer the input/output files, and 2) to which directory
they should go after the job completes. Condor acts according to
these two settings. If you don't explicitly provide the names of the
input/output directories, your submit directory is used for all
input/output operations.
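As a rough illustration of these two settings, the following sketch writes a minimal job description file that names the input/output directory explicitly. The file names (job.sub, my_input.dat, a.out) and the path /path/to/work_dir are hypothetical placeholders, not values used by this site; InitialDir, Transfer_Input_Files, should_transfer_files and when_to_transfer_output are standard condor_submit commands.

```shell
#!/bin/sh
# Hypothetical names throughout: job.sub, my_input.dat, a.out, /path/to/work_dir.
cat > job.sub <<'EOF'
Universe                = Vanilla
Executable              = a.out
# Relative paths for input, output, error and log are resolved against
# InitialDir; if InitialDir is omitted, the submit directory is used.
InitialDir              = /path/to/work_dir
# Setting 1): ask Condor to transfer files to and from the execute host.
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
Transfer_Input_Files    = my_input.dat
Output                  = job.out
Error                   = job.err
Log                     = job.log
Queue
EOF

# On a submit host you would then run: condor_submit job.sub
echo "wrote job.sub"
```

Leaving out the InitialDir line gives the default behaviour described above, where the submit directory is used for all input and output.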
The basic requirement is that a directory in AFS must have the appropriate
permissions for a user to read from and write to it. This can be
achieved in two ways: a valid AFS token for the user plus r/w permission, OR
a host-specific access-control list (ACL). Since Condor cannot
pass tokens, the only way to make sure that reads and writes
to the input/output directory succeed is to
allow all Condor hosts (submit and execute) to perform them. For NFS directories this is not an
issue. More detailed descriptions can be found
elsewhere.
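To see which of the two mechanisms currently applies to a directory, you can inspect its ACL with the standard AFS client command fs listacl. This is only a sketch: the WORK_DIR default below is a hypothetical path, and the check is skipped on machines without the AFS client tools.

```shell
#!/bin/sh
# Hypothetical working directory; set WORK_DIR to your own AFS path.
WORK_DIR="${WORK_DIR:-$HOME/condor_work}"

if command -v fs >/dev/null 2>&1; then
    # Print the ACL of the directory. Once host-based access has been
    # granted, the listing should contain an entry for condor-hosts.
    fs listacl -path "$WORK_DIR"
else
    echo "AFS client tools (fs) not found on this machine"
fi
```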
If all of your files and executables are in an AFS directory from
which you want to submit your job, follow the instructions in (a). If they are on a local disk of the machine from which you
want to submit your job, follow the instructions in (b).
(a) Your files and executables are in an AFS directory:
- First, you must make that directory writable by Condor hosts using the
following command:
fs setacl -dir $your_work_dir -acl condor-hosts rlidwk
(Why do I need to do this?)
- Then, prepare a job description file (job_description_file) that will be
passed to condor_submit.
(How do I prepare a job description file? See Basic Examples.)
A more realistic example template is the following:
Executable = $path_to_your_afs_work_directory/your_executable_name (ex: a.out)
Arguments = $arguments_to_the_Executable (ex: $arg1 $arg2 ...)
GetEnv = true
Universe = Vanilla
Transfer_Input_Files = $input_file_for_the_job
output = $output_file_name.out
error = $error_file_name.err
Log = $log_file_name.log
Copy_To_Spool = false
Notification = never
should_transfer_files = YES
WhenToTransferOutput = On_Exit
on_exit_remove = (ExitBySignal == FALSE && ExitStatus == 0)
Requirements = (TARGET.OSRedHatRelease =?= "Scientific Linux SL Release 3.0.4 (SL)" || TARGET.AFS_SYSNAME =?= "i386_tao10") && (TARGET.RebootedDaily=!=TRUE && TARGET.HasAFS =!= FALSE && TARGET.RemoteUser =!= "uscms01@hep.wisc.edu" && TARGET.RemoteUser =!= "cmsprod@hep.wisc.edu")
Queue
(What is the meaning of the above variables and what do they do ? See