I realised a while ago that it would be useful to check, occasionally, that all the machines I’m responsible for are still up. (This helps to minimise those embarrassing “Oh, I didn’t know there was anything wrong with it” conversations.)
Hence the following pretty basic Perl script, which I run from /etc/crontab on my own desktop every couple of hours:
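#!/usr/bin/perl -w
#
# host_ping.pl - run from crontab
use strict;
use Net::Ping;
use Net::SMTP;

my $ping       = Net::Ping->new();    # default protocol is TCP
my $email      = 'me@example.com';
my @host_array = qw/host1 host2 serverA serverB/;
my $hosts_down = "";

# Collect the names of any hosts that fail to answer
foreach my $host (@host_array) {
    unless ($ping->ping($host)) {
        $hosts_down .= "$host ";
    }
}
$ping->close();

# Only send mail when at least one host is down
sendmail() if ($hosts_down ne "");

sub sendmail
{
    # email to me
    my $s = Net::SMTP->new('mailserver.example.com')
        or die "Cannot connect to mail server\n";
    $s->mail($email);
    $s->to($email);
    # headers first, then a blank line, then the message body
    $s->data("To: $email\n", "Subject: Host(s) down: $hosts_down\n", "\n",
             "No ping response from: $hosts_down\n");
    $s->quit;
}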
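For reference, an /etc/crontab entry along the following lines would run the script every two hours. This is only a sketch: the user name `someuser` and the path `/usr/local/bin/host_ping.pl` are placeholders, so substitute whatever applies on your own machine.

```
# /etc/crontab format: minute hour day-of-month month day-of-week user command
# Run at minute 0 of every second hour (user and path are placeholders)
0 */2 * * * someuser /usr/local/bin/host_ping.pl
```

Since the script only sends mail when at least one host fails to respond, a run in which everything is up produces no email at all.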
Also this week, I’ve been organising an engineer for a 4TB RAID 5 array which had two disks fall over at the same time. Apparently this is increasingly common with large SATA disks (we had ten 500GB disks), probably due to the heavy load put on the remaining disks by rebuilding. And of course, since RAID 5 can only survive a single disk failure, it renders the array unusable, so there is reinstall/restore-from-tape fun on the horizon once the engineer currently in the server room has established that it’s definitely kaput.
The other current project is looking at Puppet. So far I’ve got a server and test client working, and am cautiously optimistic about prospective usefulness. I wish you could readily up the log level without having to run in the foreground, mind. I will doubtless blog more on this in future.
Original post by Juliet Kemp