The following are the most important computing resources available to CMS Tier-2 users:
Condor may be used directly on our cluster, without grid middleware. Submitted jobs flock across the UW campus grid: they may run on the HEP Tier-2, the Grid Laboratory of Wisconsin (GLOW), or the CS department Condor cluster. The only storage areas common to this entire domain are AFS and dCache. AFS is used for software export and for storing user analysis data. XRootD or dCache is used for CMS data access (read only) and for organized production (read/write). In the future, dCache space will also be available for users to write data.
Condor can run several types of jobs, some more efficiently than others. No time limit is imposed on individual jobs. However, when the owners of a particular node on the campus grid run high-priority jobs that claim their resource, lower-priority jobs on that node are evicted. Standard Universe jobs can automatically migrate to other resources as they become available, without losing prior execution results. CMS jobs, however, cannot operate within the restrictions of the Standard Universe. We therefore build them normally, as we do at CERN, and run them in the Vanilla Universe. Vanilla Universe jobs do not resume from the point of preemption; unless they do their own checkpointing, they are restarted from the beginning.
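Because Vanilla Universe jobs restart from the beginning after eviction, a job that wants to keep earlier results has to record its own progress. The following is only a minimal sketch of that idea, not a Condor or CMSSW mechanism. It assumes the work can be split into numbered steps, that process_step (a hypothetical placeholder) performs one step, and that the checkpoint file is kept somewhere that survives eviction, for example a directory on AFS.

```sh
#!/bin/sh
# Sketch of do-it-yourself checkpointing for a Vanilla Universe job.
# Assumptions (not from the site documentation): the work splits into
# numbered steps, and CHECKPOINT names a file that survives eviction.

CHECKPOINT="${CHECKPOINT:-./myjob.checkpoint}"

# Hypothetical placeholder for one unit of real work.
process_step() {
    echo "processing step $1"
}

# Run steps 1..$1, skipping any steps already recorded as done.
run_with_checkpoint() {
    total_steps="$1"
    last_done=0
    if [ -f "$CHECKPOINT" ]; then
        last_done=$(cat "$CHECKPOINT")
    fi
    step=$((last_done + 1))
    while [ "$step" -le "$total_steps" ]; do
        process_step "$step" || return 1
        # Write to a temporary file and rename it, so an eviction in the
        # middle of the write never leaves a half-written checkpoint.
        echo "$step" > "$CHECKPOINT.tmp" && mv "$CHECKPOINT.tmp" "$CHECKPOINT"
        step=$((step + 1))
    done
}
```

Calling run_with_checkpoint 10 after an eviction that happened during step 4 would start again at step 4 rather than at step 1, provided the checkpoint file is still readable on the new machine.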
Batch jobs are started by submitting a script to Condor: condor_submit <script>
You can watch the progress of your jobs using: condor_q
You can watch the progress of all jobs running on the system using: condor_q -global
You can check the status of the compute pools using:
condor_status -pool condor.hep.wisc.edu
condor_status -pool glow.cs.wisc.edu
condor_status -pool cm.chtc.wisc.edu
condor_status -pool condor.cs.wisc.edu
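To check all four pools in one go, the commands above can be wrapped in a small loop. This is a convenience sketch only: the pool names are the ones listed above, -total asks condor_status for its summary table instead of the per-machine listing, and the function name check_pools is made up for this example.

```sh
#!/bin/sh
# Query each campus pool listed above in turn and print a summary.

POOLS="condor.hep.wisc.edu glow.cs.wisc.edu cm.chtc.wisc.edu condor.cs.wisc.edu"

check_pools() {
    for pool in $POOLS; do
        echo "=== $pool ==="
        # Keep going if one pool's collector cannot be reached.
        condor_status -pool "$pool" -total || echo "could not query $pool" >&2
    done
}
```

Run check_pools from a machine where the Condor command-line tools are installed.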
The Condor submit script format is fully documented elsewhere. A sample script that submits a single CMSSW-based job is below:
Executable = /path/to/cmsRun
Arguments = myjob.cfg
GetEnv = true
Universe = Vanilla
Transfer_Input_Files = myjob.cfg
output = myjob.out
error = myjob.err
Log = myjob.condorlog
Copy_To_Spool = false
Notification = never
WhenToTransferOutput = On_Exit
on_exit_hold = (ExitBySignal =?= True || ExitStatus =!= 0)
Requirements = ((TARGET.OSRedHatRelease =?= "Scientific Linux SL Release 3.0.4 (SL)" || TARGET.OSRedHatRelease =?= "Scientific Linux SL release 4.4 (Beryllium)") && TARGET.HasAFS =!= FALSE)
Queue
This submit script runs the cmsRun executable with a configuration file on the Condor system. When the job completes, its output is returned to the submit directory. You should normally submit the job from a subdirectory of /scratch/username. If you really must submit the job from AFS, see the recommendation here for how to set up a suitable directory.
The long "Requirements=" expression ensures that jobs run on machines that are SL3 or SL4 compatible.
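When several configuration files need to be run, one submit file can queue a job for each of them. The sketch below generates such a file from the shell. It reuses settings from the sample above; the function name make_submit_file and the configuration file names in the usage note are hypothetical, and the on_exit_hold and Requirements lines are left out for brevity and would need to be added to the common settings.

```sh
#!/bin/sh
# Sketch: write one submit description that queues a job per
# configuration file given as an argument.

make_submit_file() {
    # Common settings, written once at the top of the file.
    cat <<'EOF'
Executable = /path/to/cmsRun
GetEnv = true
Universe = Vanilla
Copy_To_Spool = false
Notification = never
WhenToTransferOutput = On_Exit
EOF
    # One block of per-job settings, followed by Queue, for each file.
    for cfg in "$@"; do
        base=$(basename "$cfg" .cfg)
        cat <<EOF

Arguments = $base.cfg
Transfer_Input_Files = $cfg
output = $base.out
error = $base.err
Log = $base.condorlog
Queue
EOF
    done
}
```

For example, make_submit_file job1.cfg job2.cfg > myjobs.sub followed by condor_submit myjobs.sub would queue two jobs, each with its own output, error, and log files.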
You may use my sample shell script that submits several CMSSW jobs to fully process a dataset. This automatically submits the jobs from an appropriate /scratch directory where logs will accumulate. The data files are copied to dCache.
Example simulation submission:
/afs/hep.wisc.edu/cms/cmsprod/bin/farmoutRandomSeedJobs dataset-name total-events events-per-job /path/to/CMSSW /path/to/configTemplate.cfg
Example analysis submission:
/afs/hep.wisc.edu/cms/cmsprod/bin/farmoutAnalysisJobs jobName /path/to/CMSSW /path/to/configTemplate.cfg
Use the --help option to see the options used by these scripts.
http://www.hep.wisc.edu/~cmsprod/farmoutCmsJobs
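Before running a simulation submission, it can help to estimate how many jobs the total-events and events-per-job arguments imply. The helper below is a planning aid only. It assumes the total is divided by events-per-job and rounded up, which is a guess about how farmoutRandomSeedJobs splits the work and is not stated here; check the --help output for the actual behavior. The name estimate_jobs is made up for this example.

```sh
# Estimate the job count implied by total-events and events-per-job.
# Assumption: jobs = ceil(total / per_job); the real script may differ.
estimate_jobs() {
    total="$1"
    per_job="$2"
    if [ "$per_job" -le 0 ]; then
        echo "events-per-job must be positive" >&2
        return 1
    fi
    # Integer ceiling division.
    echo $(( (total + per_job - 1) / per_job ))
}
```

Under that assumption, estimate_jobs 1000 300 prints 4.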
Your data files will be stored in dCache under /pnfs/hep.wisc.edu/data5/uscms01/username. See the FAQ for more information on how to manage your files.
Running Jobs through the Grid
The Wisconsin CMS Tier-2 is part of the Open Science Grid (OSG). Our Globus gatekeeper is cmsgrid02.hep.wisc.edu and our site name is GLOW-CMS or T2_US_Wisconsin, depending on which information system you are using.
You can submit analysis jobs to our site through the CMS CRAB tool.
Instructions for using CRAB from UW-HEP are here.
For further information on submitting jobs through Open Science Grid, see "Grid Support"/"Support For Users" on the OSG home page.