Nagios Report Generator Upgrade: nagiosr v1.0.13

Posted: 8:51am Saturday November 11 2006

Category: Software

I just released a hugely improved version of nagiosr: for example, it picks up eight alerts from yesterday that the old version missed! (I didn't write the original alert matching code--but I've almost entirely rewritten it now.)

  ftp://noc.hep.wisc.edu/pub/src/nagiosr


New Agenda Automation Software

Posted: 8:37am Thursday November 02 2006

Category: Software

The old Agenda System, based on CERN's "CDS", had a serious security hole, so Will's implemented CERN's new version (called "Indico".) The old DIS05 and PHENO06 content has been migrated to the new system, but the old accounts don't work (just re-register if need be.)


Impressive Data Rates

Posted: 5:00pm Thursday October 26 2006

Categories: CMS, Networks

I'm rather certain we set a new record for egress traffic here at the University of Wisconsin. For a few minutes we peaked at 4.2 Gbps, and we sustained 3.3 Gbps for a little over 30 minutes. The end-to-end (Fermi to UW HEP) application (PHEDeX) data rate was around 2.7 Gbps (330 MBps.)
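For anyone who wants to check the bits-to-bytes arithmetic, here's a minimal sketch. It assumes decimal units (1 Gb = 1000 Mb) and 8 bits per byte, and it ignores any protocol overhead; the function name is made up for illustration.

```python
def gbps_to_mbytes_per_sec(gbps):
    """Convert gigabits per second to megabytes per second (decimal units)."""
    megabits = gbps * 1000  # assumption: 1 Gb = 1000 Mb
    return megabits / 8     # 8 bits per byte


# Rates quoted in the post above.
for label, rate in [("peak", 4.2), ("sustained", 3.3), ("PHEDeX end-to-end", 2.7)]:
    print(f"{label}: {rate} Gbps = {gbps_to_mbytes_per_sec(rate):.1f} MBps")
```

With these assumptions 2.7 Gbps works out to 337.5 MBps, so the 330 MBps quoted above is a rounded figure.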


DNS and DHCP Outages

Posted: 7:23am Monday October 23 2006

Category: Outages

At approximately 0205 today, one of the UW-HEP name servers (at 128.104.28.118) died. This caused minor delays in resolving domain names until around 0710--when DNS service was restored. DHCP service is still down, but should be restored shortly.


Terminal Based Interface for Nagios

Posted: 6:59am Thursday October 05 2006

Category: Monitoring

Some recent discussion on the Nagios mailing list reminded me that I never announced my super nifty full-screen terminal interface for Nagios...

  http://noc.hep.wisc.edu/cnagios.html

Cnagios and nagiosr make a darn nice replacement for the Nagios web GUI--in fact I almost never use the web GUI anymore!


Garlic Upgrade

Posted: 1:26pm Friday September 29 2006

Category: AFS

When real garlic gets old, it gets moldy and stinky. Fortunately our garlic (an AFS server for OSG file space) didn't get stinky, but it did get too slow for the job. So we've upgraded from a lowly 2 GHz Pentium 4 system to a spiffy dual 3.0 GHz Xeon system with 4 GB of memory.


"ProdAgent" Graphs

Posted: 5:00pm Friday September 22 2006

Category: Graphs

Here are some pretty graphs I made recently. I have no idea what they really represent, but I'm sure the underlying data is completely bogus...

  http://noc.hep.wisc.edu/nrg/tier2/ProdAgent-events.cgi


Disks for the g7nXX Systems

Posted: 12:07pm Thursday September 07 2006

Categories: CMS, Storage

We recently inherited 27 dual 3.0 GHz Xeon compute servers (which were already housed in our machine room.) To add space to our CMS Tier-2 storage facility, we're going to install a bunch of disks in them.

Yesterday I asked for quotes for 55 of the 750 GB Seagate 7200.10 disks, but then one vendor said they've seen an 80% failure rate with that drive. So now we're looking at buying 55 of the 500 GB Seagate drives (the NL35.2 model), which have a better MTBF. Oh, well, only another 13.5 TB instead of 20.
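The capacity figures are roughly half the raw total of 55 disks. One reading that reproduces them is that the disks end up in mirrored pairs, with the odd disk left over as a spare. That layout is purely an assumption here (the post doesn't describe it), and the function below is only a sketch of the arithmetic.

```python
def usable_tb(disk_count, disk_gb, mirrored=True):
    """Usable capacity in TB (decimal), optionally assuming mirrored pairs.

    ASSUMPTION: mirroring is a guess that happens to match the quoted
    numbers; the actual disk layout isn't stated in the post.
    """
    # In a mirrored layout each pair contributes one disk's worth of space;
    # an odd leftover disk is treated as a spare and contributes nothing.
    units = disk_count // 2 if mirrored else disk_count
    return units * disk_gb / 1000


print(usable_tb(55, 500))                  # mirrored 500 GB disks
print(usable_tb(55, 750))                  # mirrored 750 GB disks
print(usable_tb(55, 500, mirrored=False))  # raw capacity, for comparison
```

Under that assumption the 500 GB drives give 13.5 TB and the 750 GB drives give 20.25 TB, which lines up with "13.5 TB instead of 20".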


Mail Queue Monitoring with Nagios

Posted: 9:40am Tuesday September 05 2006

Categories: Mail, Monitoring

If you know Nagios, then you probably know that some of its monitoring scripts (aka "plugins") suck. And that's why I wrote nifty scripts to monitor sendmail mail queue size over SNMP today...

  ftp://noc.hep.wisc.edu/pub/src/nagios/
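For readers who haven't written a Nagios plugin, here's a rough sketch of the threshold logic such a check might use. This is not the script from the FTP directory above: the function name, the "MAILQ" label, and the thresholds are made up for illustration, and the SNMP query itself is left out (the queue size is just passed in). The only parts taken from Nagios are the standard plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and the one-line output with optional performance data after a "|".

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate_queue(size, warn, crit):
    """Map a mail queue size to a (exit_code, status_line) pair.

    HYPOTHETICAL helper: `size` would come from an SNMP query in a real
    plugin; here it is simply an argument. None or a negative value means
    the size could not be read.
    """
    if size is None or size < 0:
        return UNKNOWN, "MAILQ UNKNOWN - could not read queue size"
    if size >= crit:
        code = CRITICAL
    elif size >= warn:
        code = WARNING
    else:
        code = OK
    # Text after "|" is performance data: label=value;warn;crit
    line = f"MAILQ {LABELS[code]} - {size} messages queued | queued={size};{warn};{crit}"
    return code, line


code, line = evaluate_queue(120, warn=100, crit=500)
print(line)
```

A real plugin would print the status line and then call `sys.exit(code)` so that Nagios sees the exit status.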


Adaptec RAID Monitoring with Nagios

Posted: 7:51am Tuesday September 05 2006

Category: Monitoring

Adaptec bought a company called DPT a number of years ago. I have a long and good history of using the RAID controllers from DPT. A few months ago, I was disappointed to find out that Adaptec had stopped making DPT-based RAIDs. I emailed one of the DPT guys at Adaptec and he sent me a very friendly reply with the skinny. Adaptec no longer makes DPT-based RAID controllers, but they have folded a lot of the DPT technology into their latest ("aacraid") controllers and software. So we're now standardizing on the aacraid controllers--which I'm comfortable with because not-so-little vendors like Dell and Sun integrate aacraid controllers into their products. And thus I just finished writing a Nagios plugin to monitor our new aacraid (2130/2230S) controllers...

  ftp://noc.hep.wisc.edu/pub/src/nagios/


