Re: random hangs/reboots with Dell servers




Thanks to everyone for your replies.

A colleague has provided me with his handwritten notes of an older crash screen, which contain the following (however, I can't guarantee they are accurate, since they were copied by hand).

Fatal trap 12: page fault while in kernel mode
cpuid=0; apicid=00
fault virtual address=0xac
fault code=supervisor write,page not present
instruction pointer=0x20:0x
current process 79962
trap numbers : 12
panic: pagefault
cpuid=1
uptime=6d7423m55


I do not believe the problems are related to environment or electricity: during the period in which the problems occurred we switched data centers, and in addition to the Dell systems there are 150 more nodes from various vendors (mostly HP, but also IBM, Supermicro, Sun, and various assembled towers), none of which has shown similar behaviour. We don't run FreeBSD on them, though. We have a Dell 2850 with Windows 2003 that has been running rock solid for at least a year. And the 1750, which under FreeBSD 5 would sometimes crash even under no load, has been pushing 60 Mbps of FTP data 24/7 under RHEL 4 for the last year without any problems.

Disabling everything in the BIOS was one of our first moves, though we haven't disabled USB since we sometimes need to connect a keyboard. And no IPMI is running on a public interface :)

Apart from all the nodes being SMP and Dell, I cannot think of anything else they have in common. Some are SCSI, some are SATA. All have a number of jails. The 1750 has 2 GB of memory; the others have 4 GB.

I have also asked Dell for help. They told me that FreeBSD is not certified by Dell, but that they will try to look into it.


--
============================================================================

Dimitris Zilaskos

Department of Physics @ Aristotle University of Thessaloniki , Greece
PGP key : http://tassadar.physics.auth.gr/~dzila/pgp_public_key.asc
http://egnatia.ee.auth.gr/~dzila/pgp_public_key.asc
MD5sum : de2bd8f73d545f0e4caf3096894ad83f pgp_public_key.asc
============================================================================

_______________________________________________
freebsd-questions@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscribe@xxxxxxxxxxx"


