[WBEL-users] Still killing me softly with mon
Jesse
j@lumiere.net
Tue, 14 Sep 2004 13:04:42 -0700 (PDT)
Tests will fail from time to time. Pretty much every monitoring system
runs into this, even commercial load balancers with health checks.
I use mon to monitor a few hundred hosts. The key is the 'alertafter'
option, which sets the number of times the test must fail consecutively
before an alert is issued. I recommend 2 or 3, depending on the type of
service/test. You'll want to set 'upalertafter' as well: it is the length
of time the service must have been down before you get an upalert
(otherwise you'll keep getting upalerts without corresponding alerts). So
upalertafter should be interval * alertafter.
Hope that helps.
For example:
watch mail
    service ping
        description ICMP ping of mail servers
        interval 1m
        monitor fping.monitor -r 3
        period wd {Mon-Sun}
            alertafter 3
            alertevery 30m
            alert mail.alert <snip>
            upalertafter 3m
            upalert mail.alert <snip>
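
The 3m in the example comes from the interval * alertafter rule (1m * 3).
If you maintain many services with different intervals, the arithmetic can
be scripted. The following is only a rough Python sketch, not part of mon:
the helper names (parse_timeval, format_timeval, upalertafter_for) are
made up for illustration, and it assumes time values are a number followed
by s, m, h or d.

```python
import re

# Seconds per unit; assumes mon-style suffixes s, m, h, d.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_timeval(text):
    """Convert a string such as '30s', '1m' or '1.5h' to seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]


def format_timeval(seconds):
    """Render seconds using the largest unit that divides them evenly."""
    for unit in ("d", "h", "m"):
        size = _UNITS[unit]
        if seconds >= size and seconds % size == 0:
            return f"{int(seconds // size)}{unit}"
    return f"{int(round(seconds))}s"


def upalertafter_for(interval, alertafter):
    """Hypothetical helper: upalertafter = interval * alertafter."""
    return format_timeval(parse_timeval(interval) * alertafter)


# The values from the example config: interval 1m, alertafter 3.
print(upalertafter_for("1m", 3))  # prints 3m
```

With a 30s interval and alertafter 2, the same helper gives 1m.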
---
Jesse <j@lumiere.net>