AlphaServer 1000A has CIA machine check ECC errors and fails to reboot

From: Geert Van Pamel (geertivp-030421_at_belgacom.net)
Date: 03/08/04


Date: Mon, 8 Mar 2004 19:57:38 +0100

Hello Alpha fellows,

Original symptoms: I could only ping the server; telnet was no longer
possible.

After connecting a laptop to the console serial port, I found the following
repeated in the console log:

CIA machine check: vector=0x630 pc=0xfffffc0000855700 code=0x86
machine check type: correctable ECC error (retryable)

CIA machine check: vector=0x630 pc=0xfffffc00008556f4 code=0x86
machine check type: correctable ECC error (retryable)

... etc

I pressed the reset button to try to reboot...

... start booting
...
probing PCI-to-EISA bridge, bus 1
probing PCI-to-PCI bridge, bus 2
bus 2, slot 0 -- pka -- QLogic ISP1020
bus 0, slot 11 -- ewa -- DECchip 21140-AA
ed.ec.eb..

But then I got an endless series of error messages such as:

Processor correctable error through vector 00000063.

EI_STAT: FFFFFFF0C5FFFFFF EI_ADDR: FFFFFF00010500CF
FILL_SYN: 0000000000000094 ISR: 0000000100000000

Processor correctable error through vector 00000063.

EI_STAT: FFFFFFF0C5FFFFFF EI_ADDR: FFFFFF0001041D8F
FILL_SYN: 0000000000000094 ISR: 0000000100000000

... etc

The boot never completed, because the above error kept repeating.
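
To check whether the errors cluster on a few values, the captured console output can be tallied with a short script. The following is only a minimal sketch in Python, assuming the serial console session was saved as plain text. The name summarize_errors and the SAMPLE string are hypothetical, made up for illustration; the patterns are copied from the messages quoted above. It only counts values. It does not map an address or syndrome to a physical memory module, since that depends on the memory configuration of the machine.

```python
import re
from collections import Counter

# Patterns based on the console messages quoted in this post.
CIA_RE = re.compile(
    r"CIA machine check: vector=(0x[0-9a-fA-F]+) "
    r"pc=(0x[0-9a-fA-F]+) code=(0x[0-9a-fA-F]+)"
)
EI_ADDR_RE = re.compile(r"EI_ADDR:\s*([0-9A-Fa-f]{16})")
FILL_SYN_RE = re.compile(r"FILL_SYN:\s*([0-9A-Fa-f]{16})")

# Hypothetical sample: a few lines copied from the log above.
SAMPLE = """\
CIA machine check: vector=0x630 pc=0xfffffc0000855700 code=0x86
machine check type: correctable ECC error (retryable)
CIA machine check: vector=0x630 pc=0xfffffc00008556f4 code=0x86
machine check type: correctable ECC error (retryable)
Processor correctable error through vector 00000063.
EI_STAT: FFFFFFF0C5FFFFFF EI_ADDR: FFFFFF00010500CF
FILL_SYN: 0000000000000094 ISR: 0000000100000000
Processor correctable error through vector 00000063.
EI_STAT: FFFFFFF0C5FFFFFF EI_ADDR: FFFFFF0001041D8F
FILL_SYN: 0000000000000094 ISR: 0000000100000000
"""


def summarize_errors(log_text):
    """Count how often each code, EI_ADDR and FILL_SYN value appears."""
    return {
        "cia_codes": Counter(m.group(3) for m in CIA_RE.finditer(log_text)),
        "ei_addr": Counter(EI_ADDR_RE.findall(log_text)),
        "fill_syn": Counter(FILL_SYN_RE.findall(log_text)),
    }


if __name__ == "__main__":
    # Print each group of values, most frequent first.
    for name, counts in summarize_errors(SAMPLE).items():
        print(name)
        for value, n in counts.most_common():
            print(f"  {value}  x{n}")
```

In the two processor error reports quoted above, FILL_SYN is the same while EI_ADDR differs. A tally over the full log would show whether that pattern holds for all the errors.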

What advice could you give me?

I suppose I have a memory problem?

How can I find out which RAM chips are failing?

AlphaServer 1000A running Linux Red Hat 7.2

Thanks,
Geert

