help! picld error - is it a hardware issue?

From: ktn (ktn_at_dodo.com.au)
Date: 06/11/04

  • Next message: Burtenshaw, Craig: "Command to find available ethernet devices."
    Date: Fri, 11 Jun 2004 9:55:06 +1000
    To: sunmanagers@sunmanagers.org
    
    

    Dear managers, need your prompt help!

    I've been getting these errors in /var/adm/messages constantly since
    rebooting a machine (the reboot was due to a power failure, by the way).
    It is a Sun Fire V880 running Solaris 9, kernel Generic_112233-12:

    ...
    Jun 11 03:12:12 serv picld[93]: [ID 710302 daemon.error] I/O error
    Jun 11 03:12:13 serv picld[93]: [ID 478985 daemon.error] ERROR running psvc_fan_fault_check_policy_0 on CPU1_PRIM_FAN (2499992)
    Jun 11 03:12:13 serv picld[93]: [ID 710302 daemon.error] I/O error
    Jun 11 03:12:15 serv picld[93]: [ID 478985 daemon.error] ERROR running psvc_fan_fault_check_policy_0 on IO_BRIDGE_PRIM_FAN (2500216)
    Jun 11 03:12:15 serv picld[93]: [ID 710302 daemon.error] I/O error
    Jun 11 03:12:48 serv picld[93]: [ID 478985 daemon.error] ERROR running psvc_fan_fault_check_policy_0 on CPU0_PRIM_FAN (2499960)
    Jun 11 03:12:48 serv picld[93]: [ID 710302 daemon.error] I/O error
    Jun 11 03:12:49 serv picld[93]: [ID 478985 daemon.error] ERROR running psvc_fan_fault_check_policy_0 on CPU1_PRIM_FAN (2499992)
    Jun 11 03:12:49 serv picld[93]: [ID 710302 daemon.error] I/O error
    Jun 11 03:12:51 serv picld[93]: [ID 478985 daemon.error] ERROR running psvc_fan_fault_check_policy_0 on IO_BRIDGE_PRIM_FAN (2500216)
    ...
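
To see which fans the policy errors hit and how often, a tally along these lines may help. This is only a minimal sketch: the function name `count_fan_errors` and the regular expression are assumptions based solely on the message format in the excerpt above, and whitespace is collapsed first so that syslog lines hard-wrapped in transit (for example by a mail client) still match.

```python
import re
from collections import Counter

# Matches the picld policy-failure message. The pattern is inferred from the
# log excerpt in this post, not from any picld documentation.
ERROR_RE = re.compile(
    r"picld\[\d+\]: \[ID \d+ daemon\.error\] ERROR running\s+(\S+) on (\S+)"
)


def count_fan_errors(text):
    """Count picld policy errors per fan name in a chunk of syslog text."""
    # Collapse every run of whitespace (including newlines) into one space so
    # that entries split across several lines are matched as a single record.
    flat = " ".join(text.split())
    return Counter(fan for _policy, fan in ERROR_RE.findall(flat))


# Sample copied from the excerpt above, wrapped the way it arrived by mail.
SAMPLE = """\
Jun 11 03:12:12 serv picld[93]: [ID 710302 daemon.error] I/O error
Jun 11 03:12:13 serv picld[93]: [ID 478985 daemon.error] ERROR running
psvc_fan_fault_check_policy_0 on CPU1_PRIM_FAN (249
9992)
Jun 11 03:12:15 serv picld[93]: [ID 478985 daemon.error] ERROR running
psvc_fan_fault_check_policy_0 on IO_BRIDGE_PRIM_FAN
(2500216)
Jun 11 03:12:48 serv picld[93]: [ID 478985 daemon.error] ERROR running
psvc_fan_fault_check_policy_0 on CPU0_PRIM_FAN (249
9960)
Jun 11 03:12:49 serv picld[93]: [ID 478985 daemon.error] ERROR running
psvc_fan_fault_check_policy_0 on CPU1_PRIM_FAN (249
9992)
"""

if __name__ == "__main__":
    for fan, hits in count_fan_errors(SAMPLE).most_common():
        print(fan, hits)
```

On the real machine the same function could be fed the contents of /var/adm/messages instead of `SAMPLE`.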

    In the logs from the reboot, "Device PS2 unplugged" is the last error
    picld gives. Could this be a cause of the problem?
    ...
    May 30 20:37:57 serv eri: [ID 517527 kern.info] SUNW,eri0 : 100 Mbps full duplex link up
    May 30 20:38:00 serv last message repeated 1 time
    May 30 20:38:02 serv pseudo: [ID 129642 kern.info] pseudo-device: devinfo0
    May 30 20:38:02 serv genunix: [ID 936769 kern.info] devinfo0 is /pseudo/devinfo@0
    May 30 20:42:23 serv picld[93]: [ID 293134 daemon.error] Device PS2 unplugged
    May 30 20:42:50 serv fsck[164]: [ID 293258 user.error] libsldap: Status: 2 Mesg: Unable to load configuration '/var/ldap/ldap_client_file' ('').
    May 30 20:42:50 serv last message repeated 3 times
    May 30 20:42:50 serv picld[93]: [ID 478985 daemon.error] ERROR running psvc_fan_fault_check_policy_0 on CPU0_PRIM_FAN (2499960)
    May 30 20:42:50 serv picld[93]: [ID 875627 daemon.error] No such file or directory
    May 30 20:42:51 serv fsck[164]: [ID 293258 user.error] libsldap: Status: 2 Mesg: Unable to load configuration '/var/ldap/ldap_client_file' ('').
    May 30 20:42:51 serv last message repeated 5 times
    May 30 20:42:52 serv picld[93]: [ID 478985 daemon.error] ERROR running psvc_fan_fault_check_policy_0 on CPU1_PRIM_FAN (2499992)
    May 30 20:42:52 serv picld[93]: [ID 875627 daemon.error] No such file or directory
    May 30 20:42:53 serv fsck[164]: [ID 293258 user.error] libsldap: Status: 2 Mesg: Unable to load configuration '/var/ldap/ldap_client_file' ('').
    May 30 20:42:53 serv last message repeated 2 times
    ...

    Running prtdiag shows the following, and the "no memory" part is giving me
    a heart attack. Judging from the logs above, could this just be an
    incomplete boot? I am thinking of rebooting the machine to see whether the
    result is the same, or do you think something is definitely failing? Many
    thanks in advance for reading. Will summarise.

    >prtdiag -v
    System Configuration: Sun Microsystems sun4u Sun Fire 880
    System clock frequency: 150 MHz
    Memory size: 8192 Megabytes

    ========================= CPUs ===============================================

              Run   E$    CPU      CPU
    Brd  CPU  MHz   MB    Impl.    Mask
    ---  ---  ----  ----  -------  ----
     A    0   750   8.0   US-III   5.4
     B    1   750   8.0   US-III   5.4
     A    2   750   8.0   US-III   5.4
     B    3   750   8.0   US-III   5.4

    ========================= Memory Configuration ===============================

               Logical  Logical  Logical
          MC   Bank     Bank     Bank         DIMM    Interleave  Interleaved
    Brd   ID   num      size     Status       Size    Factor      with
    ----  ---  ----     ------   -----------  ------  ----------  -----------
    Cannot find any memory bank/segment info.

    ========================= IO Cards =========================

                              Bus   Max
          IO    Port  Bus     Freq  Bus   Dev,
    Brd   Type  ID    Side  Slot  MHz   Freq  Func  State  Name                              Model
    ----  ----  ----  ----  ----  ----  ----  ----  -----  --------------------------------  ----------------------
    I/O   PCI   9     A     8     33    66    1,0   ok     SUNW,m64B                         SUNW,370-4362

    No failures found in System
    ===========================

    ========================= Environmental Status =========================

    System Temperatures (Celsius):
    -------------------------------
    Device Temperature Status
    ---------------------------------------
    CPU0 68 OK
    CPU1 73 OK
    CPU2 59 OK
    CPU3 61 OK
    MB 31 OK
    IOB 26 OK
    DBP0 28 OK

    =================================

    Front Status Panel:
    -------------------
    Keyswitch position: NORMAL

    System LED Status:
    GEN FAULT REMOVE
    [OFF] [OFF]

    DISK FAULT POWER FAULT
    [OFF] [OFF]

    LEFT THERMAL FAULT RIGHT THERMAL FAULT
    [OFF] [OFF]

    LEFT DOOR RIGHT DOOR
    [OFF] [OFF]

    =================================

    Disk Status:
    Presence Fault LED Remove LED
    DISK 0: [PRESENT] [OFF] [OFF]
    DISK 1: [PRESENT] [OFF] [OFF]
    DISK 2: [PRESENT] [OFF] [OFF]
    DISK 3: [PRESENT] [OFF] [OFF]
    DISK 4: [PRESENT] [OFF] [OFF]
    DISK 5: [PRESENT] [OFF] [OFF]
    DISK 6: [ EMPTY]
    DISK 7: [ EMPTY]
    DISK 8: [ EMPTY]
    DISK 9: [ EMPTY]
    DISK 10: [ EMPTY]
    DISK 11: [ EMPTY]

    =================================

    Fan Bank :
    ----------

    Bank             Speed       Status      Fan State
                     ( RPMS )
    ----             --------    ---------   ---------
    CPU0_PRIM_FAN    1298089537  [ENABLED]   OK
    CPU1_PRIM_FAN    1298089537  [ENABLED]   OK
    CPU0_SEC_FAN     0           [DISABLED]  OK
    CPU1_SEC_FAN     0           [DISABLED]  OK
    IO0_PRIM_FAN     4000        [ENABLED]   OK
    IO1_PRIM_FAN     3947        [ENABLED]   OK
    IO0_SEC_FAN      0           [DISABLED]  OK
    IO1_SEC_FAN      0           [DISABLED]  OK
    IO_BRIDGE_PRIM_FANfailed in picl_get_propval_by_name for fan speed
    General system failure
    Power Supplies:
    ---------------

    Supply     Status     Fan Fail  Temp Fail  CS Fail  3.3V   5V   12V   48V
    ------  ------------  --------  ---------  -------  ----   --   ---   ---
    PS0         GOOD                                       9    4     3     5
    PS1         GOOD                                       9    3     3     5
    PS2      UNPLUGGED

    ========================= HW Revisions =======================================

    System PROM revisions:
    ----------------------
    OBP 4.5.6 2002/01/04 12:30

    IO ASIC revisions:
    ------------------
    Port
    Brd Model ID Status Version
    ---- --------------- ---- ------ -------
    IB-1 unknown 8 ok 4
    IB-1 unknown 9 ok 4
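
The fan table in the prtdiag output reports 1298089537 RPM for both CPU primary fans, which no physical fan can reach, so the reading itself looks bogus rather than the fan necessarily being broken. A check along these lines could flag such rows automatically. It is a sketch under stated assumptions: `suspicious_fans` is a hypothetical helper, and the 20000 RPM ceiling is a guess chosen for illustration, not a Sun specification.

```python
import re

# Assumed sanity ceiling for a tachometer reading, in RPM. This number is a
# guess for illustration only; it does not come from Sun documentation.
MAX_PLAUSIBLE_RPM = 20000

# One row of the "Fan Bank" table: name, speed, [STATUS], state.
ROW_RE = re.compile(r"^\s*(\w+_FAN)\s+(\d+)\s+\[\s*(\w+)\s*\]\s+(\w+)\s*$")


def suspicious_fans(prtdiag_text):
    """Return (name, rpm) for enabled fans whose reported speed looks bogus."""
    flagged = []
    for line in prtdiag_text.splitlines():
        match = ROW_RE.match(line)
        if match is None:
            continue  # headers, separators, and error lines are skipped
        name, rpm, status = match.group(1), int(match.group(2)), match.group(3)
        # An enabled fan should report a nonzero speed below the ceiling.
        if status == "ENABLED" and not (0 < rpm <= MAX_PLAUSIBLE_RPM):
            flagged.append((name, rpm))
    return flagged


# Fan rows copied from the prtdiag output in this post.
SAMPLE = """\
Bank Speed Status Fan State
( RPMS )
---- -------- --------- ---------
CPU0_PRIM_FAN 1298089537 [ENABLED] OK
CPU1_PRIM_FAN 1298089537 [ENABLED] OK
CPU0_SEC_FAN 0 [DISABLED] OK
CPU1_SEC_FAN 0 [DISABLED] OK
IO0_PRIM_FAN 4000 [ENABLED] OK
IO1_PRIM_FAN 3947 [ENABLED] OK
IO0_SEC_FAN 0 [DISABLED] OK
IO1_SEC_FAN 0 [DISABLED] OK
IO_BRIDGE_PRIM_FANfailed in picl_get_propval_by_name for fan speed
General system failure
"""

if __name__ == "__main__":
    for name, rpm in suspicious_fans(SAMPLE):
        print(name, rpm)
```

Disabled secondary fans are ignored on purpose, since a speed of 0 is expected for them; the IO_BRIDGE_PRIM_FAN line carries no speed at all and is therefore skipped by the row pattern.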

    _______________________________________________
    sunmanagers mailing list
    sunmanagers@sunmanagers.org
    http://www.sunmanagers.org/mailman/listinfo/sunmanagers

