Cold boot

What to check after a power cut:

  • Check Ganglia services [ http://ganglia.hep.wisc.edu/ganglia/ ]
  • Check Nagios services [ https://icinga.hep.wisc.edu/icinga ]
  • Check HDFS main server (cmshdfsnn/cmshdfs01)
  • Check Xrootd main server (cmsxrootd/cmshdfs02)
  • Check SRM main server (cmssrm/cmshdfs03)
  • Check GUMS server/services (gums01)
  • Check cmsgrid01 services
  • Check cmsgrid02 services
  • Check cmsgrid03 services
  • Check all VMhosts/services
  • Check Condor services
  • Check SAM3 metrics/plots
  • Check Gratia services
  • Check PhEDEx services (phedex02/VM)
  • Check squid/frontier services (frontier01, frontier02, frontier03, frontier04)
  • Check cvmfscacheXXX services (cvmfscache01, cvmfscache02, cvmfscache03, cvmfscache04)
  • Check proxy_renewer on cron02 and restart it if it is not running. Example:

    /cms/cmsprod/proxy_renewer/proxy_renewer --voms=cms --valid=200:00 --chown=cmsprod
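
A first pass over the checklist above can be partly scripted. The following is a minimal sketch, not an existing site tool: the function name check_hosts, the DOMAIN default (guessed from the monitoring URLs above), and the use of a single Linux ping as the probe are all assumptions. It only tells you whether a host answers on the network, not whether its services are healthy.

```shell
#!/bin/sh
# Sketch: report which checklist hosts respond after a cold boot.
# DOMAIN is an assumption based on the monitoring URLs; adjust as needed.
DOMAIN="${DOMAIN:-hep.wisc.edu}"
# PROBE is the command run against each fully qualified host name.
# Default: one ICMP echo with a 2 second timeout (Linux iputils ping).
PROBE="${PROBE:-ping -c 1 -W 2}"

# check_hosts HOST... : print "OK" or "FAIL" per host.
# Returns 0 only if every host answered the probe.
check_hosts() {
    failed=0
    for host in "$@"; do
        # $PROBE is left unquoted on purpose so its options split into words.
        if $PROBE "$host.$DOMAIN" >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

After sourcing the file, a call might look like `check_hosts cmshdfs01 cmshdfs02 cmshdfs03 gums01 cmsgrid01 cmsgrid02 cmsgrid03`. Hosts reported as FAIL are the ones to look at first; hosts reported as OK still need their services checked by hand as listed above.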