[WBEL-users] Still killing me softly with mon

Kirby C. Bohling kbohling@birddog.com
Tue, 14 Sep 2004 16:35:34 -0500


On Tue, Sep 14, 2004 at 02:21:23PM -0700, Ed Morrison wrote:
> > I recommend using 2 or 3,
> > depending on the type of service/test. You'll want to specify upalertafter
> > as well, as minutes the service must be down before you get an upalert
> > (otherwise you'll get keeping upalerts without corresponding alerts). SO
> > upalertafter should be interval * alertafter.
> 
> 
> Jesse,
> 
> Thanks for the help.  I definitely needed to add those lines to my
> config.  Unfortunately, mon still doesn't recognize anything it is
> watching to be up.  It's as though it doesn't receive the replies, which
> is why I thought it had to do with my iptables or my firewall.  Anymore
> thoughts would be appreciated.
> 

I've never used mon before, but how about checking the obvious
things:

1.  What are the results of running ping by hand?  (both by IP and
by name).

2.  What is the firewall configuration?

3.  Can you use wget to pull down web pages from that machine (from
anywhere local and/or remote)?

4.  Can you use the various servers it claims are down from that
machine?

5.  Are the monitoring scripts separate programs?  Have you tried
running them by hand?  If they can't be run by hand, find other
software; I'd recommend nagios.  DAG builds it for RHEL3.0.

6.  Have you written an "I can't fail" monitoring script, tried
using it, and looked at what it gives back as a result?  It might be
a problem with the plugin architecture, or a permissions problem.
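
For items 5 and 6, here is a rough sketch of what I mean, in Python.
It assumes (I have not checked this against mon itself) that a
monitor is just a program that gets host names as arguments and
signals success with exit status 0.  The names always_ok and
run_by_hand are made up for illustration.

```python
import subprocess
import sys


def always_ok(hosts):
    """A monitor that cannot fail: ignore the hosts and report success."""
    print("ok: " + " ".join(hosts))
    return 0


def run_by_hand(command, hosts, timeout=30):
    """Run a monitor command with hosts as arguments.

    Returns (exit_code, combined stdout/stderr) so you can see exactly
    what the calling daemon would see.
    """
    result = subprocess.run(
        list(command) + list(hosts),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Saved as a script, this is the "can't fail" monitor itself.
    sys.exit(always_ok(sys.argv[1:]))
```

If the always-succeeding script is still reported as down, the
problem is in how the monitor gets launched, not in the network.  If
run_by_hand raises FileNotFoundError or PermissionError for a real
monitor, that points at a missing executable or a permissions
problem.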

Have you tried running strace on processes to see what error codes
you are getting from syscalls?  

Have you run ethereal on either or both ends to see what they are
seeing?
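
A cheaper first step than a full packet capture is a plain TCP
connect test that tells "refused" apart from "no answer at all".
This is only a sketch; probe is a name I made up, and the host and
port are whatever service you are checking.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a TCP connect attempt as open, refused, timeout, or error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The other end (or a rejecting firewall) actively answered.
        return "refused"
    except socket.timeout:
        # Nothing came back before the timeout expired.
        return "timeout"
    except OSError as exc:
        # Name resolution failures, unreachable networks, and so on.
        return "error: %s" % exc
```

As a rule of thumb, "timeout" is what silently dropped packets tend
to look like, while "refused" usually means the packets got through
and nothing was listening.  Neither result is proof on its own.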

Where did you get the RPM, did you recompile it for WBEL, or was it
pre-built for a different distro?

	My best guess is that the binary RPM you installed is the same one
on RH8.0 as on WBEL3.0, and that's the source of the problem. It's
forking a process which can't find the appropriate dynamic libraries
and dies immediately after forking (I've seen this before), or it's
exec'ing an executable that doesn't exist, or doesn't exist in the
path where it existed on the RH8.0 machine.
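
Both guesses can be checked without strace.  A minimal sketch,
assuming you know which helper program the monitor tries to run and
can paste in the output of "ldd" for it; find_executable and
missing_libraries are hypothetical helper names.

```python
import shutil


def find_executable(name, path=None):
    """Return the full path of name on the search path, or None."""
    return shutil.which(name, path=path)


def missing_libraries(ldd_output):
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # Lines look like: "<tab>libfoo.so.1 => not found"
            missing.append(line.split("=>")[0].strip())
    return missing
```

A None from find_executable means the program is not where the
caller expects it.  A non-empty list from missing_libraries means the
binary was linked against libraries this system doesn't have.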

	I've never run "mon", and never heard of it until this exchange
of e-mails.  So take all this with a grain of salt.  However, it
should be easy to track down your problem if the application is
reasonably well written.  "strace -f" is your friend in a lot of
cases.

	If you tell me where you got "mon" from, I'd happily look into
how it works, enough to tell you whether I can get the default
out-of-the-box configuration to work by pinging a machine.

	Thanks,
		Kirby