High Energy Physics CMS Tier-2 Facilities

Overview of Resources

The following are the most important computing resources available to CMS Tier-2 users:

  • SSH login access: login.hep.wisc.edu
  • Open Science Grid gatekeepers: cmsgrid01.hep.wisc.edu, cmsgrid02.hep.wisc.edu (GLOW)
  • UW Campus Condor Grid

    • glow.hep.wisc.edu: pool of 3000 CPU cores. Includes the Wisconsin CMS Tier-2 resources, which give first priority to CMS users.
    • cm.chtc.wisc.edu: Wisconsin’s Center for High Throughput Computing (CHTC). Pool of 1500 CPU cores.
    • condor.hep.wisc.edu: pool of desktop machines and submit hosts.
    • Typically we (CMS) have access to thousands of CPU cores at any time. Jobs that you submit from hep.wisc.edu “flock” to all the other sites beyond condor.hep.wisc.edu, so do not hesitate to submit as many jobs as you need.
  • 1000 TB (and growing) of storage space in HDFS

Batch Processing using Condor

Condor may be used directly for running CMS jobs on our cluster. Once submitted, the jobs flock all over the UW campus grid; in particular, they run on the Grid Laboratory Of Wisconsin (GLOW) and Wisconsin’s Center for High Throughput Computing (CHTC) clusters. The only storage areas common to this entire domain are AFS and HDFS: AFS is used for software, and HDFS is used for CMS data files.

It is best to design jobs with runtimes of around 12 hours or less, to avoid wasting a lot of computing time when jobs are preempted by other jobs with higher priority. On CMS-controlled resources, preemption may only happen after a job has been running for 24 hours, so definitely avoid jobs with runtimes longer than 24 hours.

Abridged instructions for CMS Users

Batch jobs are started by submitting a submit description file to Condor: condor_submit
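For reference, here is a minimal sketch of a submit description file and the matching submit command. The file name, wrapper script, and number of queued jobs are hypothetical placeholders; adjust them to your own work, and keep each job short as discussed above.

# example.sub -- a hypothetical, minimal submit description file.
universe                = vanilla
executable              = run_analysis.sh
# Pass the job index so each job processes a different slice of the work.
arguments               = $(Process)
output                  = logs/job_$(Process).out
error                   = logs/job_$(Process).err
log                     = logs/job.log
# Jobs flock to pools without a shared filesystem, so let Condor transfer files.
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
# Queue many short jobs rather than one long one.
queue 50

condor_submit example.sub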

You can watch the progress of your jobs using: condor_q

You can watch the progress of all jobs running on the system using: condor_q -global
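A few other condor_q variations are often useful when jobs are not behaving as expected. The job ID below is a hypothetical placeholder:

condor_q -hold            # list held jobs together with the hold reason
condor_q -run             # show where your running jobs are executing
condor_q -analyze 1234.0  # explain why a particular job is still idle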

You can check the status of the compute pools using:

condor_status -pool condor.hep.wisc.edu
condor_status -pool glow.cs.wisc.edu
condor_status -pool cm.chtc.wisc.edu
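
If the full slot-by-slot listing is too long, the same pools can be summarized; for example:

condor_status -pool glow.cs.wisc.edu -total   # totals only, rather than one line per slot
condor_status -pool cm.chtc.wisc.edu -avail   # only slots currently available to run jobs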

You can ssh to a job to debug it interactively using:

condor_ssh_to_job jobid-here

“farmout” shell script for submitting a bunch of CMSSW jobs

You may use my sample shell script called “farmout” that submits CMSSW jobs to fully process a dataset. This automatically submits the jobs from an appropriate /scratch directory where logs will accumulate. The data files are copied to HDFS.

Example simulation submission:

/afs/hep.wisc.edu/cms/cmsprod/bin/farmoutRandomSeedJobs \
    dataset-name \
    total-events \
    events-per-job \
    /path/to/CMSSW \
    /path/to/configTemplate.py

Example analysis submission:

/afs/hep.wisc.edu/cms/cmsprod/bin/farmoutAnalysisJobs \
    jobName \
    /path/to/CMSSW \
    /path/to/configTemplate.py

Use the --help option to see the options used by these scripts.

Your data files will be stored in HDFS under /hdfs/store/user/username. See the FAQ for more information on how to manage your files, how to use farmout, and how to use Condor.
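Since the path above suggests the HDFS namespace is mounted under /hdfs on the login machines, you can inspect your output area with ordinary commands, as in this sketch. The dataset name is a hypothetical placeholder, and the FAQ describes how to delete or move files:

ls -lh /hdfs/store/user/username/             # list your stored datasets
du -sh /hdfs/store/user/username/my-dataset   # check how much space one dataset uses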

Running Jobs through the Grid

The Wisconsin CMS Tier-2 is part of the Open Science Grid (OSG). Our Globus gatekeeper is cmsgrid02.hep.wisc.edu and our site name is GLOW-CMS or T2_US_Wisconsin, depending on which information system you are using.

You can submit analysis jobs to our site through the CMS CRAB tool. Instructions for using CRAB from UW-HEP are here.
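As a rough sketch only (assuming the CRAB3 client; older CRAB releases use a different syntax, so follow the UW-HEP instructions above), submitting and checking a task looks like:

crab submit -c crabConfig.py                   # crabConfig.py names T2_US_Wisconsin as the storage site
crab status -d crab_projects/task-directory    # task-directory is a placeholder for your project directory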

For further information on submitting jobs through Open Science Grid, see “Grid Support”/“Support For Users” on the OSG home page.