OK, I need some "3rd party" opinions

From: Mark J. Bailey (mjb_at_jobsoft.com)
Date: 04/26/05


Date: Mon, 25 Apr 2005 21:27:44 -0500

I have a client with an F50 at AIX 5200-01 and firmware:

         ROM Level.(alterable).......L03273
         ROM Level.(non-alterable)...wc010611
         ROM Level.(alterable).......wc010611

They started having the error (below) occur intermittently during a full
system backup to /dev/rmt0 (8mm DAT) with cpio, starting back in February
and recurring more and more frequently since then (several times just
this past week). I know what I think the problem is, based on the error
below and some additional ones accompanying it (including a
successful system dump to the primary dump device, /dev/hd7). But we
are running into some static and (to me) confusion with hardware support
(IBM); they insist this is NOT a hardware fault issue - it's a software
issue (um uh ... yeah).

**NOTHING** has changed on this system since 12/2003 when AIX 5.2 was
loaded (no maintenance updates, APARs, PTFs; NO NOTHING) *and* this has
been running without incident from 12/2003 through 02/2005 *and* for the
5-6 years prior to the 12/2003 upgrade to AIX 5.2. All
recommended F50 and other microcode updates were applied at the AIX 5.2
upgrade time as per invscout and IBM Microcode Discovery Service. Oh
yeah, support said it was bad tapes, so we bought all new tapes...guess
what?! :-)

So, the short of it is that this appeared out of the blue. The box is
slated for replacement at the end of this calendar year. We need to make
it last until then. Failing backups are not a good thing. We also
started to see hostmibd aborting shortly after reboot with Signal 11.
This may or may not have anything to do with my main issue herein.

I would love some comments! What course of action does this suggest to
the others here??? Basically, I need some ammo to push/insist harder if
need be, or to be equally educated in the other direction and just
shut up! :-) Honestly, call me lucky maybe, but I have never had this
much hassle from IBM before. It's not sitting well.

Thanks ever so much!

Mark

---------------------------------------------------------------------------
LABEL: MACHINE_CHECK_CHRP
IDENTIFIER: 56CDC3C8

Date/Time: Tue Apr 19 00:11:03 CDT
Sequence Number: 28264
Machine Id: 000211174C00
Node Id: cvaf50
Class: H
Type: PERM
Resource Name: sysplanar0
Resource Class: planar
Resource Type: sysplanar_rspc
Location:

Description
MACHINE CHECK

Probable Causes
UNDETERMINED

Failure Causes
PROCESSOR MACHINE CHECK

Recommended Actions
RUN SYSTEM DIAGNOSTICS.

Detail Data
MACHINE STATUS SAVE/RESTORE REGISTER 0
0000 0000 0002 07E0
MACHINE STATUS SAVE/RESTORE REGISTER 1
0000 0000 0008 9032
PROBLEM DATA
0194 4000 0000 0078 C600 8200 0511 0700 2005 0419 0008 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 4942 4D00 5031 2050 3200 0000 000C 4646
0000 FF00 0008 9032 000C 4646 0100 FF00 0000 0000 000C 4646 0200 0005 0020 8C00
000C 4646 0200 0004 0040 0400 000C 4646 0200 0000 8000 0000 000C 4646 0300 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000

Diagnostic Analysis
Diagnostic Log sequence number: 7596
Resource tested: sysplanar0
Resource Description: System Planar
Location:
SRN: 651-725
Description: I/O Host Bridge address/data parity error
Possible FRUs:
    n/a FRU: 73H1925 P1
    n/a FRU: 07L6594 P2
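
A side note on the PROBLEM DATA dump: some of the hex words look like plain ASCII text rather than register values. Below is a minimal Python sketch, assuming the dump is a straight byte stream and that runs of printable bytes are text. It pulls the readable strings out of the second 16-word row. The helper name printable_runs is made up for illustration and is not part of any AIX tool.

```python
import re

# Second 16-word row of PROBLEM DATA from the error report above
# (joined onto one line where the mail client wrapped it).
ROW = ("0000 0000 0000 0000 0000 0000 0000 0000 "
       "4942 4D00 5031 2050 3200 0000 000C 4646")


def printable_runs(hex_words, min_len=3):
    """Return runs of printable ASCII found in a string of hex words.

    Hypothetical helper for illustration only. min_len=3 filters out
    short accidental matches such as a lone pair of 0x46 bytes.
    """
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, raw)]


print(printable_runs(ROW))  # ['IBM', 'P1 P2']
```

The "P1 P2" string appears to line up with the P1 and P2 locations of the two FRUs listed under Possible FRUs, though that reading is an inference from the bytes, not something the error report states.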


