Ultra 10 crashing

From: Dominic Clarke (dominicc_at_foe.co.uk)
Date: 05/31/05

  • Next message: Daryl.Mitchell_at_us.o-i.com: "to mirror swap or not to mirror swap, that is the question"
    Date: Tue, 31 May 2005 15:17:18 +0100
    To: sunmanagers@sunmanagers.org
    
    

    An Ultra 10 running important services is crashing with the following in
    /var/adm/messages:

    Jul 17 21:56:35 oradea ntpd[341]: [ID 702911 daemon.notice] time reset 0.223050 s
    [...] oradea ntpd[341]: [ID 702911 daemon.error] kernel time discipline status change 41
    [...] [AFT0] Corrected Memory Error detected by CPU0, errID 0x0000008b.cf663f42
    Jul 17 22:01:23 oradea AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1014b690
    Jul 17 22:01:23 oradea UDBH Syndrome 0x2a Memory Module DIMM1
    [...] [AFT0] errID 0x0000008b.cf663f42 Corrected Memory Error on DIMM1 is Persistent
    [...] [AFT0] errID 0x0000008b.cf663f42 ECC Data Bit 53 was in error and corrected
    [...] [AFT0] Corrected Memory Error detected by CPU0, errID 0x0000008b.cf669a2c
    Jul 17 22:01:23 oradea AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1014b690
    Jul 17 22:01:23 oradea UDBH Syndrome 0x2a Memory Module DIMM1
    [...] [AFT0] errID 0x0000008b.cf669a2c Corrected Memory Error on DIMM1 is Persistent
    [...] [AFT0] errID 0x0000008b.cf669a2c ECC Data Bit 53 was in error and corrected
    [...] [AFT0] Corrected Memory Error detected by CPU0, errID 0x000000b2.b43869ce
    Jul 17 22:04:10 oradea AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x100257b4
    Jul 17 22:04:10 oradea UDBH Syndrome 0x6d Memory Module DIMM1
    [...] [AFT0] 3 soft errors in less than 24:00 (hh:mm) detected from Memory Module DIMM1
    Jul 17 22:04:10 oradea unix: [ID 618185 kern.notice] NOTICE: Scheduling removal of page 0x00000000.00f80000
    [...] unix: [ID 693633 kern.notice] NOTICE: Page 0x00000000.00f80000 removed from service

    ......

    Jul 17 22:04:10 oradea unix: [ID 618185 kern.notice] NOTICE: Scheduling removal of page 0x00000000.00f80000
    [...] errID 0x000000b2.b4461676 Corrected Memory Error on DIMM1 is Persistent
    [...] errID 0x000000b2.b4461676 ECC Data Bit 48 was in error and corrected
    [...] Corrected Memory Error detected by CPU0, errID 0x000000b2.b44a75af
    Jul 17 22:04:10 oradea AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x[...] 100257c4
    Jul 17 22:04:10 oradea UDBH Syndrome 0x6d Memory Module DIMM1
    [...]ors in less than 24:00 (hh:mm) detected from Memory Module DIMM1
    Jul 17 22:04:10 oradea unix: [ID 618185 kern.notice] NOTICE: Scheduling removal of page 0x00000000.00f80000

    x0000008b.cf669a2c ECC Data Bit 53 was in error and corrected

    ed Memory Error detected by CPU0, errID 0x000000b2.b43869ce

    @

    Jul 17 22:04:10 oradea UDBH Syndrome 0x6d Memory Module DIMM1
    --------------------------------------------------------------------------------

    A Sun engineer has been in to change the memory, but it is still
    crashing. Any ideas, please?
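
    To see at a glance which module and which ECC data bits the messages
    keep naming, the log can be tallied with a short script. The following
    is only a minimal sketch in Python: the function name summarize and the
    SAMPLE list are made up for illustration, the sample lines are copied
    from the excerpt above, and the patterns match message fragments rather
    than whole lines, so wrapped or partly overwritten lines still count.

```python
import re
from collections import Counter

# Fragments of the [AFT0] messages quoted in the post. Matching fragments
# (not whole lines) tolerates lines whose start was lost or overwritten.
PERSISTENT_RE = re.compile(r"Corrected Memory Error on (\S+) is Persistent")
DATA_BIT_RE = re.compile(r"ECC Data Bit (\d+) was in error and corrected")


def summarize(lines):
    """Count persistent corrected errors per module and per ECC data bit.

    Hypothetical helper: takes any iterable of log lines and returns
    a pair of Counters (modules, bits).
    """
    modules = Counter()
    bits = Counter()
    for line in lines:
        match = PERSISTENT_RE.search(line)
        if match:
            modules[match.group(1)] += 1
        match = DATA_BIT_RE.search(line)
        if match:
            bits[int(match.group(1))] += 1
    return modules, bits


# Sample lines taken from the excerpt in the post, damage included.
SAMPLE = [
    "x0000008b.cf663f42 Corrected Memory Error on DIMM1 is Persistent] [AFT0]",
    "x0000008b.cf663f42 ECC Data Bit 53 was in error and correctednfo] [AFT0]",
    "x0000008b.cf669a2c Corrected Memory Error on DIMM1 is Persistent] [AFT0]",
    "x0000008b.cf669a2c ECC Data Bit 53 was in error and correctednfo] [AFT0]",
    "x000000b2.b4461676 Corrected Memory Error on DIMM1 is Persistent]",
    "x000000b2.b4461676 ECC Data Bit 48 was in error and correctednfo]",
]

modules, bits = summarize(SAMPLE)
for module, count in modules.most_common():
    print(f"{module}: {count} persistent corrected error(s)")
for bit, count in sorted(bits.items()):
    print(f"ECC data bit {bit}: {count} corrected error(s)")
```

    To run it against the real file, pass an open file object instead of
    SAMPLE, for example summarize(open("/var/adm/messages", errors="replace")).
    If the tally still names the same module after the DIMMs have been
    swapped, that is worth mentioning to the engineer.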

    Dominic Clarke
    _______________________________________________
    sunmanagers mailing list
    sunmanagers@sunmanagers.org
    http://www.sunmanagers.org/mailman/listinfo/sunmanagers

