<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

</head>

<body bgcolor="#ffffff" text="#000000">

Interesting, I will set up netdump then; it looks like a very good way to

learn something more about my system and Linux internals in general :).

We actually have 2 servers running mail filtering, so I can dump one

to the other and vice versa. This morning I moved one HD to a

different position to get better heat dissipation, and it looks like I

achieved a drop of about 10 &deg;C, which is not bad. Now the warm one is at 50&deg;C vs

the &gt;59&deg;C it was before. I will also run memtest for some time (I checked the

memory before the install) and see if I get more

info... actually the memory is the only thing I took from the first

installation, so it would make sense...<br>

I'll let you know, thanks, have a nice day<br>

<br>

Simone<br>

&nbsp;<br>

Kirby C. Bohling wrote:

<blockquote cite="mid20050412214942.GF9706@birddog.com" type="cite">

  <pre wrap="">On Tue, Apr 12, 2005 at 11:38:12AM +0200, Simone wrote:

  </pre>

  <blockquote type="cite">

    <pre wrap="">Hi list,

I periodically experience kernel panics on my wbel3 box. This server is 

running as a front end mail filter for exchange, with a typical 

MailScanner - sendmail - clamav setup. I also have mailwatch for 

mailscanner running, which is a LAMP-style setup. Every 3-4 weeks I have 

a crash, with blinking keyboard lights, and I don't understand the 

reason for it. I recently installed a new server with the same configuration, 2 x 

18 GB SCSI disks in RAID 1, thinking it was possibly a hardware problem, but 

this morning, after a month of running fine, I had the first crash. Could 

you please tell me where to look for possible indications of what could 

have caused the panic? I checked log/messages but it looks like no useful 

info is in there.

Thanks for your suggestions

Have a fine day

    </pre>

  </blockquote>

  <pre wrap=""><!---->

Simone,

        Well, obviously this isn't terribly proactive, but one thing you

might try is setting up netdump.  I've never set it up, but I'm

intending to sooner rather than later.  Essentially, we have a

number of machines that kernel panic while screen blanking is on,

so you don't get any information.

        In theory netdump will dump core over a network so that you

capture the state of the machine at the time of the crash and can do

the debugging.  Okay, I'm guessing you can't, otherwise you wouldn't

need this suggestion (I can't either).  However, you might be able

to use the symbols and backtrace information from the oops to track

down what the cause is via googling.  With an oops, you can use that

as a starting point.

        In order to do this, you'll need another machine running Linux,

with as much free disk space as the machines you are dumping from

have RAM + swap, if I remember correctly.

        If I were you, I'd put memtest86 in the machine and let it run

for a couple of days.  It sure sounds like memory corruption or

overheating.  I've had machines with memory corruption that ran for

weeks or months at a time.  It wasn't until something critical got

stored where the bad bits were that it crashed.

        Kirby

_______________________________________________

Whitebox-users mailing list

<a class="moz-txt-link-abbreviated" href="mailto:Whitebox-users@beau.org">Whitebox-users@beau.org</a>

<a class="moz-txt-link-freetext" href="http://beau.org/mailman/listinfo/whitebox-users">http://beau.org/mailman/listinfo/whitebox-users</a>

  </pre>

</blockquote>

<br>

</body>

</html>