Dead Fuel

From: Atro Tossavainen (Atro.Tossavainen+news_at_helsinki.finland.invalid)
Date: 04/13/05


Date: 13 Apr 2005 09:36:01 +0300

Our Fuel recently started refusing to boot up.

Attaching a serial console, I can see the bootup messages. All the
initial diagnostics pass. Going to single user from the hard drive
and booting the miniroot from CD both end in a panic. I was smart
enough not to have installed the Customer Diagnostics CD when the
machine was still alive, so I haven't gotten any further in
diagnosing the issue.

The machine is no longer under warranty and we don't have a service
contract either. (I know - massively stupid.)

SGI believe it's a case of fried CPU and are telling us to buy a
new or refurbished machine because just shipping a replacement CPU
over from abroad will cost over 5000 euros, which would probably be
more than what they would charge us for a refurb Octane2 with V12.

A new Fuel with a 700 MHz R16000 and 1 GB of RAM now costs 14500
euros (list price excl VAT; we would probably get an edu discount),
and three years of FullCare would cost slightly over EUR 2700 (ditto).

Anybody with any ideas, or a spare R14000 they want to part with for
a reasonable price? :-)

Here's the boot log.

Discovering local IO ...................... DONE
Discovering NUMAlink connectivity .........
Local hub NUMAlink is down.
*** Local network link down
DONE
Found 1 objects (1 hubs, 0 routers) in 30361 usec
Waiting for peers to complete discovery.... DONE
No other nodes present; becoming global master
Global master is /hw/rack/001/bay/01
Intializing any CPUless nodes.............. DONE
Checking partitioning information ......... DONE
No other nodes present; becoming partition master
Loading BASEIO prom ....................... DONE

BASEIO PROM Monitor SGI Version 6.201 built 08:56:46 AM Jun 2, 2004 (BE64)
1 CPUs on 1 nodes found.
Automatic update of PROM environment disabled

PS/2 Keyboard & Mouse diagnostics
    No keyboard found, skipping keyboard test
    No mouse found, skipping mouse test

    Missing PS/2 device(s) AND console set to "g"
PS/2 Keyboard & Mouse diagnostics passed with a possible problem

Graphics diagnostics

Odyssey board #0 found on nasid 0
Running Odyssey xtalk sanity diag...
        Board version 1 - Buzz revision 2B
        On board sdram size: 128 Mb
        Cas latency: CAS 3
        4 banks by sdram module
Running Odyssey Buzz registers diag...
Device passed diagnostics

Installing PROM Device drivers ............
Base I/O Ethernet set to /dev/ethernet/ef0
Installing Graphics Console...
graphics install: searching for pipe 0

Walking SCSI Adapter 0, (pci id 1)
1+ Device Vendor Product: SGI ST318406LW
2- 3- 4- 5- 6- 7- 8- 9- 10- 11- 12- 13- 14- 15- = 1 device(s)

Walking SCSI Adapter 1, (pci id 1)
1- 2- 3+ Device Vendor Product: TEAC CD-W512SB
4- 5- 6- 7- 8- 9- 10- 11- 12- 13- 14- 15- = 1 device(s)

Initializing PROM Device drivers .......... DONE
Cannot connect to keyboard -- check the cable.
Cannot open /dev/input/ioc3pckm0 for input
Cannot connect to keyboard -- check the cable.
Cannot open /dev/input/ioc3pckm0 for input
Checking hardware inventory ............... DONE

**** System Configuration and Diagnostics Summary ****
CONFIG:
         No. of NODEs enabled = 1
         No. of NODEs disabled = 0
         No. of CPUs enabled = 1
         No. of CPUs disabled = 0
         Mem enabled = 512 MB
         Mem disabled = 0 MB
         No. of RTRs enabled = 0
         No. of RTRs disabled = 0

DIAG RESULTS:
         ALL DIAGS PASSED.
**** End System Configuration and Diagnostics Summary ****

System Maintenance Menu

1) Start System
2) Install System Software
3) Run Diagnostics
4) Recover System
5) Enter Command Monitor

Option? 5
Command Monitor. Type "exit" to return to the menu.

>> boot
896+111704+16853+3848 entry: 0xa8000000012a6ea8
Standalone Shell SGI Version 6.5 ARCS Jul 7, 2004 (64 Bit)
sash: boot
6352575+1298752+1051600 entry: 0xa8000000000418e0
IRIX Release 6.5 IP35 Version 07080049 System V - 64 Bit
Copyright 1987-2004 Silicon Graphics, Inc.
All Rights Reserved.

WARNING: odsy dfifo timeout
[repeated many times]

WARNING: Xbow at /hw/module/001c01/Ibrick/xtalk/13 encountered Fatal error
xtalk PIO Read error in kernel mode
        widgetnum: 0xd
        srccpu: 0x0
        srcnode: 0x0
        errnode: 0x0
        sysioaddr: 0xd23241c
        xtalkaddr: 0x23241c
        vaddr: 0x920000000d23241c
        epc: 0xc00000000047d850
        ef: 0xa8000000011c3688
(NOTE: CPU 0=/hw/module/001c01/node/cpubus/0/a)

ALERT: PIO Error on XIO Bus /hw/module/001c01/node/xtalk/0/xbow[xbow# 0] port 13
        Access attempted to offset 0x23241c
kl_ioerror_handler Failed handling PIO Read error in kernel mode
        widgetnum: 0xd
        srccpu: 0x0
        srcnode: 0x0
        errnode: 0x0
        sysioaddr: 0xd23241c
        xtalkaddr: 0x23241c
        vaddr: 0x920000000d23241c
        epc: 0xc00000000047d850
        ef: 0xa8000000011c3688
(NOTE: CPU 0=/hw/module/001c01/node/cpubus/0/a)

HARDWARE ERROR STATE:
+ Errors on node Nasid 0x0 (0)
+ IP35 in /hw/module/001c01/node [serial number MMN561]
+ BEDROCK signalled following errors.
+ BEDROCK PI 0 Error Interrupt Register: 0x200000
+ 21: CPU A received uncorrectable error during uncached load
+ BEDROCK PI 0 Error Status 0 A Register: 0x800348c907600006
+ 02<->00: rrb error type 6 Response Data Error
+ 24<->17: message command 0xb0 Reply(PRPLY)
+ 61<->25: error address 0x1a46483 << 3 = (0xd232418)
+ 63<->62: error status valid (no over_run)
+ BEDROCK PI 0 Error Status 1 A Register: 0x10c80000400000
+ 28<->21: crb timeout counter 0x2
+ 52<->43: crb status 0x219
End Hardware Error State
++FRU ANALYSIS BEGIN
No rules triggered: Insufficient data

Timeout Histogram is empty.

++FRU ANALYSIS END
PANIC: Kernel Data Bus error in IO space at physical address 0xd23241c (EPC 0xc00000000047d850)
Restarting the machine...
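
For anyone who wants to check the register decode in the HARDWARE ERROR
STATE dump by hand, here is a minimal Python sketch. It only re-derives
the fields the kernel already printed, using the bit ranges shown in the
log itself (for example "61<->25"). The helper name bits() and the
dictionary keys are made up for illustration; nothing here comes from
SGI documentation, and it does not say which part is actually at fault.

```python
def bits(value, hi, lo):
    """Return bits hi..lo of value, inclusive on both ends."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


# Raw register values copied from the HARDWARE ERROR STATE dump.
ERR_INT = 0x200000
ERR_STATUS0_A = 0x800348c907600006
ERR_STATUS1_A = 0x10c80000400000

# Bit ranges are the ones printed next to each field in the log.
status0 = {
    "rrb_error_type": bits(ERR_STATUS0_A, 2, 0),
    "message_command": bits(ERR_STATUS0_A, 24, 17),
    # The log shows this field shifted left by 3 to get a byte address.
    "error_address": bits(ERR_STATUS0_A, 61, 25) << 3,
    "status_bits": bits(ERR_STATUS0_A, 63, 62),
}
status1 = {
    "crb_timeout_counter": bits(ERR_STATUS1_A, 28, 21),
    "crb_status": bits(ERR_STATUS1_A, 52, 43),
}

# Which bits are set in the Error Interrupt Register.
int_bits_set = [n for n in range(64) if (ERR_INT >> n) & 1]

for name, field in {**status0, **status1}.items():
    print(f"{name:22s} {field:#x}")
print("interrupt bits set:", int_bits_set)
```

The decoded error address comes out as 0xd232418, which is the panic
address 0xd23241c with its low three bits cleared. That is what you
would expect if the register stores the address in 8-byte units, as the
"<< 3" in the log suggests.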

Trying to boot miniroot from CD:

...
QLFC: running as interrupt thread.
QLFC: using spinlocks.
WARNING: Xbow at /hw encountered Fatal error
xtalk PIO Read error in kernel mode
        [tons of parameters similar to above]
kl_ioerror_handler Failed handling PIO Read error in kernel mode
        [tons of parameters similar to above]
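
The addresses in the first panic dump also line up with each other, as
the short Python sketch below shows. The layout it assumes (a 24-bit
offset within the widget, with the widget number in the bits directly
above it, and a fixed 0x92 top byte on the virtual address) is inferred
purely from the numbers in this one log, not from any SGI reference, so
treat it as a consistency check and nothing more. All variable names
are illustrative.

```python
# Values copied from the "xtalk PIO Read error in kernel mode" dump.
WIDGETNUM = 0xd
SYSIOADDR = 0xd23241c
XTALKADDR = 0x23241c
VADDR = 0x920000000d23241c

# Assumption (inferred from this log only): the low 24 bits are the
# offset inside the widget, and the widget number sits just above them.
OFFSET_BITS = 24

widget_from_addr = SYSIOADDR >> OFFSET_BITS
offset_from_addr = SYSIOADDR & ((1 << OFFSET_BITS) - 1)

# Assumption: the virtual address is the same I/O address with a fixed
# top byte (0x92 in this log) selecting an uncached kernel window.
vaddr_top_byte = VADDR >> 56
vaddr_low = VADDR & ((1 << 56) - 1)

print(f"widget from sysioaddr: {widget_from_addr:#x}")
print(f"offset from sysioaddr: {offset_from_addr:#x}")
print(f"vaddr top byte:        {vaddr_top_byte:#x}")
print(f"vaddr low bits:        {vaddr_low:#x}")
```

Under those assumptions the widget number (0xd, i.e. xbow port 13) and
the offset 0x23241c both fall out of sysioaddr, matching the "port 13"
and "offset 0x23241c" lines in the ALERT message.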

-- 
Atro Tossavainen (Mr.)               / The Institute of Biotechnology at
Systems Analyst, Techno-Amish &     / the University of Helsinki, Finland,
+358-9-19158939  UNIX Dinosaur     / employs me, but my opinions are my own.
<URL:http://www.helsinki.fi/%7Eatossava/> NO FILE ATTACHMENTS

