Analyzing Events with CMS Software

These are instructions for analyzing many events in parallel and merging the output.


Setting Up

  1. Figure out the exact dataset you want to run over from the list of Officially produced data stored at Wisconsin. The string you need looks something like one of these:

    /QCD_Pt30/Summer09-MC_31X_V3_7TeV_AODSIM-v1/AODSIM
    /PhotonJet_Pt15/Summer09-MC_31X_V3_7TeV_TrackingParticles_AODSIM-v1/AODSIM
    /ZeeJet_Pt0to15/Summer09-MC_31X_V3_7TeV_AODSIM-v1/AODSIM
    
  2. The configuration file is the same as what would normally be used, but two lines must be edited to look like this:

    ...
    process.source = cms.Source("PoolSource",
        fileNames = cms.untracked.vstring(
            $inputFileNames
        )
    )
    ...
    process.SimpleAnalyzer.OutputFile = '$outputFileName'
    ...
    

    The “$inputFileNames” and “$outputFileName” variables will be replaced by the farmoutAnalysisJobs script later.

    Here is an example configuration file for reference: MultiPhotonAnalyzer_cfg.py
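
    For context, here is a minimal sketch of a complete configuration in this style. It assumes a hypothetical analyzer named SimpleAnalyzer with an OutputFile parameter, as in the snippet above; note that the $ placeholders are not valid Python until the farmout script substitutes them:

    import FWCore.ParameterSet.Config as cms

    process = cms.Process("Analysis")
    process.maxEvents = cms.untracked.PSet(input = cms.untracked.int32(-1))

    # The farmout script substitutes the actual list of input files here
    process.source = cms.Source("PoolSource",
        fileNames = cms.untracked.vstring(
            $inputFileNames
        )
    )

    # Hypothetical analyzer module; substitute your own analyzer here
    process.SimpleAnalyzer = cms.EDAnalyzer("SimpleAnalyzer",
        OutputFile = cms.string('$outputFileName')
    )

    process.p = cms.Path(process.SimpleAnalyzer)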


Running the Analysis

  1. Go to your CMSSW folder and set up the environment:

    cd ~/CMSSW_2_1_0_pre6/src/Analysis/
    eval `scramv1 runtime -sh` 
    
  2. Get a valid grid proxy and make it valid for a decent number of hours:

    voms-proxy-init -valid 40:00
    
  3. Run a script to farm out the analysis jobs to Condor. If you’re running on data you produced yourself:

    farmoutAnalysisJobs FolderName ~/CMSSW_2_1_0_pre6/ ~/CMSSW_2_1_0_pre6/src/Analysis/exampleConfig.cfg
    

    This will submit an analysis job for every ROOT file contained in

    /hdfs/store/user/$USER/FolderName/
    

    Or, if you’re running on a dataset known to DBS, use a command like:

    farmoutAnalysisJobs --input-dbs-path=/ph1j_20_60-alpgen/CMSSW_1_6_7-CSA07-1201165474/RECO ph1j_20_60 ~/CMSSW_1_6_8/ ~/CMSSW_1_6_8/src/Analysis/analgen.cfg
    

    This will also submit a job for every ROOT file in the input dataset.

    Here’s an example of a successful submission:

    farmoutAnalysisJobs --input-files-per-job=40 PhotonJet500-1000 ~/CMSSW_2_1_0_pre6/ ~/CMSSW_2_1_0_pre6/src/Analysis/PhotonJetAnalyzer/photonjetanalyzer.cfg
    Generating submit files in /scratch/mbanderson/PhotonJet500-1000-photonjetanalyzer...
    .............
    Submitting job(s).............
    Logging submit event(s).............
    13 job(s) submitted to cluster 11892.
    Jobs for PhotonJet500-1000 are created in /scratch/mbanderson/PhotonJet500-1000-photonjetanalyzer
    Monitor your jobs at
    http://www.hep.wisc.edu/~mbanderson/jobMonitor.php
    
  4. Wait about 2 minutes, then visit http://www.hep.wisc.edu/~YourScreenName/jobMonitor.php to see an auto-generated graph of the status of your jobs.

  5. If you want more information about your jobs, type condor_q YourScreenName to see your jobs in the queue. This will return something like:

    -- Submitter: login02.hep.wisc.edu : <144.92.180.5:58221> : login02.hep.wisc.edu
     ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
    11892.0   mbanderson      7/22 08:28   0+00:04:05 R  0   1220.7 cmsRun.sh photonje
    11892.1   mbanderson      7/22 08:28   0+00:04:05 R  0   1220.7 cmsRun.sh photonje
    11892.2   mbanderson      7/22 08:28   0+00:04:06 R  0   1220.7 cmsRun.sh photonje
    
    3 jobs; 0 idle, 3 running, 0 held
    

    For more information, use the job ID number:

    condor_q -l 11892.2
    
  6. Finally, when your jobs finish, your files will all be located in a folder in HDFS:

     /hdfs/store/user/$USER/AnyName/
    

    To delete, rename, or move files there, type:

    gsido
    

    This should give you a shell running as the same Unix account that owns your files in HDFS. You can then cd to your directory and do what you wish with your files.


Merging the Final ROOT Files

  1. When your jobs are all finished, go to the scratch space on your machine

    cd /scratch/
    

    and type

    mergeFiles --copy-timeout=10 final.root /hdfs/store/user/$USER/AnyName/ 
    

    This will create a merged ROOT file in your current directory. (Note: we suggest doing this in scratch space because, if the final ROOT file is very large, it could take up too much of your AFS quota.) Type

    mergeFiles --help
    

    to see a list of other options.

Merging Files with Different Cross-sections

Use mergeFiles to merge files with the SAME cross-section. Do that until you have a small set of ROOT files with different cross-sections; then you can merge or plot from them in one of two ways:

  • To correctly merge histograms only, download the ROOT macro hadd.C to your current directory, edit it, and use it from the ROOT command line like so:

    root [0] .x hadd.C 
    

    This will combine the HISTOGRAMS, taking cross-sections into account, into one final ROOT file.
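
    If it helps to see the idea rather than the macro, here is a rough PyROOT sketch of a cross-section-weighted histogram merge. The file names, the histogram name h1, and the cross-section/event numbers are illustrative assumptions; this sketches the technique, not the actual contents of hadd.C:

     import ROOT

     # Illustrative inputs: (file name, cross-section in pb, events generated)
     samples = [
         ("phtn_jets_20_30-NEW.root", 1.319E5, 49961.),
         ("phtn_jets_30_50-NEW.root", 4.114E4, 34646.),
     ]

     combined = None
     for fname, xsec, nevents in samples:
         f = ROOT.TFile.Open(fname)
         h = f.Get("h1")            # hypothetical histogram name
         h.SetDirectory(0)          # detach so it survives closing the file
         h.Scale(xsec / nevents)    # weight by cross-section per generated event
         if combined is None:
             combined = h.Clone("h1_combined")
             combined.SetDirectory(0)   # keep ownership with us, not the file
         else:
             combined.Add(h)
         f.Close()

     out = ROOT.TFile("combined.root", "RECREATE")
     combined.Write()
     out.Close()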

  • To plot ntuples from multiple files with different cross-sections, download PlotFromFiles.C, then edit the following code and place it in a file called rootlogon.C in your local directory:

     G__loadfile("/afs/hep.wisc.edu/home/YourUserName/Folder/PlotFromFiles.C");
    
     // Create a default canvas and histogram.
     // These are used by PlotFromNtuple.C, Plot2hists1D.C, and PlotFromFiles.C
     TCanvas *c1 = new TCanvas("c1","testCanvas",640,480);
     TH1F *h1 = new TH1F("h1","Blah",20,-5,5);
    
     // *******************************************************
     // Specify files for "PlotFromFiles"
     const int NUM_OF_FILES = 6;
    
     TFile *fileArray[NUM_OF_FILES];
     fileArray[0] = new TFile( "phtn_jets_20_30-NEW.root"   );
     fileArray[1] = new TFile( "phtn_jets_30_50-NEW.root"   );
     fileArray[2] = new TFile( "phtn_jets_50_80-NEW.root"   );
     fileArray[3] = new TFile( "phtn_jets_80_120-NEW.root"  );
     fileArray[4] = new TFile( "phtn_jets_120_170-NEW.root" );
     fileArray[5] = new TFile( "phtn_jets_170_300-NEW.root" );
    
     // List of Cross-sections divided by num of events produced
     double crossSections[NUM_OF_FILES] = { 1.319E5/49961.,
                                            4.114E4/34646.,
                                            7.210E3/45295.,
                                            1.307E3/8874.,
                                            2.578E2/9281.,
                                            8.709E1/23867. };
     // *******************************************************
    

    Then, when you open ROOT, you can use it at the command line like so:

     root [0] PlotFromFiles("HiEtRecoPhtn","eta","deltaEt>0.08&&deltaEta<0.3",-3.4,3.4,61)
     Saving Images/HiEtRecoPhtn-eta-deltaEtGT0.08-deltaEtaLT0.3.gif
     Info in <TCanvas::Print>: GIF file Images/HiEtRecoPhtn-eta-deltaEtGT0.08-deltaEtaLT0.3.gif has been created
     root [1] 
    

    The parameters that must be provided are:

     PlotFromFiles("Ntuple Name","Variable Name","Cuts", x-min, x-max, number-of-bins)
    
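
    PlotFromFiles.C itself is a site-specific macro, but the underlying loop can be sketched in PyROOT as well. The tree name, variable, cuts, and weights below simply reuse the examples above and are otherwise assumptions:

     import ROOT

     ROOT.gROOT.SetBatch(True)  # do not pop up canvases

     # Files and per-file weights (cross-section / events generated), as in rootlogon.C
     files   = ["phtn_jets_20_30-NEW.root", "phtn_jets_30_50-NEW.root"]
     weights = [1.319E5/49961., 4.114E4/34646.]

     total = ROOT.TH1F("total", "HiEtRecoPhtn eta", 61, -3.4, 3.4)
     for fname, w in zip(files, weights):
         f = ROOT.TFile.Open(fname)
         tree = f.Get("HiEtRecoPhtn")   # ntuple name from the example above
         # Fill a temporary histogram with the cuts applied; "goff" suppresses drawing
         tree.Draw("eta>>htmp(61,-3.4,3.4)", "deltaEt>0.08&&deltaEta<0.3", "goff")
         htmp = ROOT.gDirectory.Get("htmp")
         total.Add(htmp, w)             # accumulate with cross-section weight
         f.Close()

     c1 = ROOT.TCanvas("c1", "testCanvas", 640, 480)
     total.Draw()
     c1.SaveAs("HiEtRecoPhtn-eta.gif")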
