Re: [PATCH] Machine Check Architecture on amd64
- From: Ed Schouten <ed@xxxxxx>
- Date: Tue, 26 Jun 2007 08:55:20 +0200
* Suleiman Souhlal <ssouhlal@xxxxxxxxxxx> wrote:
Hi,
I have a simple patch for amd64 that uses the Machine Check
Architecture/Exceptions on most recent x86 CPUs to detect memory errors:
http://people.freebsd.org/~ssouhlal/testing/mce-20070621.diff
It reports uncorrected errors, and also corrected errors if the sysctl
machdep.mce.log_corrected is set to 1.
You can ask the kernel to panic if it gets an uncorrected error by setting
machdep.mce.panic_on_uc=1.
All this can be disabled by setting the machdep.mce.enable tunable to 0. I'm
still not sure if I want this enabled by default, as I don't have any Intel
machines to test this on, but I have tested it on Opteron (both corrected
and uncorrected errors).
I would appreciate it if someone would try this, especially if you have
Intel machines with bad RAM.
Comments are welcome.
| /*
| * Uncorrected MCEs will generate a #MC, while corrected
| * don't, so we have to periodically poll for them.
| */
What about adding an option to print only uncorrected MCEs? That's the
most interesting data, and we can get it without using a kthread,
right?
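
To make the distinction in the quoted comment concrete, here is a minimal
sketch in C of how a status word could be classified. It is not taken from
the patch. The bit positions follow the architectural MCi_STATUS layout
(VAL is bit 63, UC is bit 61), but every identifier below (mce_classify,
mce_should_report, mce_format, the MC_STATUS_* macros) is hypothetical, and
the log_corrected argument only mirrors the machdep.mce.log_corrected
sysctl mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural MCi_STATUS flag bits (macro names are hypothetical). */
#define MC_STATUS_VAL   (1ULL << 63)  /* register holds a valid error */
#define MC_STATUS_OVER  (1ULL << 62)  /* an earlier error was overwritten */
#define MC_STATUS_UC    (1ULL << 61)  /* error was not corrected */
#define MC_STATUS_ADDRV (1ULL << 58)  /* the matching MCi_ADDR is valid */

enum mce_kind { MCE_NONE, MCE_CORRECTED, MCE_UNCORRECTED };

/* Classify a raw status word; anything without VAL set is ignored. */
static enum mce_kind
mce_classify(uint64_t status)
{
	if ((status & MC_STATUS_VAL) == 0)
		return (MCE_NONE);
	return ((status & MC_STATUS_UC) != 0 ? MCE_UNCORRECTED : MCE_CORRECTED);
}

/*
 * Reporting policy: uncorrected errors are always reported, corrected
 * ones only when log_corrected is set (assumed to mirror the sysctl).
 */
static bool
mce_should_report(uint64_t status, bool log_corrected)
{
	switch (mce_classify(status)) {
	case MCE_UNCORRECTED:
		return (true);
	case MCE_CORRECTED:
		return (log_corrected);
	default:
		return (false);
	}
}

/* Format a one-line description of a bank's status into buf. */
static int
mce_format(int bank, uint64_t status, char *buf, size_t len)
{
	static const char *const names[] = { "none", "corrected", "uncorrected" };

	assert(buf != NULL && len > 0);
	return (snprintf(buf, len, "MCE bank %d: %s, status 0x%016llx%s%s",
	    bank, names[mce_classify(status)], (unsigned long long)status,
	    (status & MC_STATUS_OVER) != 0 ? ", overflow" : "",
	    (status & MC_STATUS_ADDRV) != 0 ? ", address valid" : ""));
}
```

With log_corrected false, mce_should_report() returns true only for
uncorrected errors. If the premise of the quoted comment holds, those
arrive via #MC, so a poller would only be needed for the corrected case.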
Nice work! :-)
--
Ed Schouten <ed@xxxxxx>
WWW: http://g-rave.nl/