Hard Lock/SCSI CAM ERROR/5.1B/ES40/HSZ70

From: David Knight (dknight_at_fitzandfloyd.com)
Date: 08/25/03

    Date: Mon, 25 Aug 2003 10:17:46 -0500
    To: tru64-unix-managers@ornl.gov
    
    

    Managers,
    I had a hard lock of my Alpha server this morning (ES40 / HSZ70 (RA7000) / 5.1B PK4): keyboard and console not responding, no ping, and the halt button had no effect. I hit the restart button on the ES40, and on reboot (firmware check) I received memory errors on the LCD; the system then stopped booting before I ever got my console. I then performed a cold boot of the system with the halt button in to get to the SRM. At the SRM I performed test mem, etc., and received no errors. I then continued to boot the system (rc3) with success. I have no errors in any of my OS logs or alert logs, and no core files. The only errors I found were in my binary error log (below). The errors refer to SCSI CAM LUN 0, target 1, which is on my HSZ70. From the RA7000, show reports that the state is good, with no errors on this LUN/target (R5). Correct me if I'm wrong, but SCSI CAM errors wouldn't cause a system lock. I would think I would at least get a kernel panic out of the deal.
    Any thoughts/leads on my issue would be greatly appreciated.

    Thanks,
    David Knight

    UERF:

    ----- EVENT INFORMATION -----
    EVENT CLASS ERROR EVENT
    OS EVENT TYPE 199. CAM SCSI
    SEQUENCE NUMBER 2263.
    OPERATING SYSTEM DEC OSF/1
    OCCURRED/LOGGED ON Sun Aug 24 04:44:36 2003
    OCCURRED ON SYSTEM alpha0
    SYSTEM ID x000D0022
    SYSTYPE x00000000
    PROCESSOR COUNT 2.
    PROCESSOR WHO LOGGED x00000000
    ----- UNIT INFORMATION -----
    CLASS x0037
    SUBSYSTEM x0037
    BUS # x0000
                                  x0008 LUN x0
                                            TARGET x1

    _____________________________________________________

    ======================= Binary Error Log event =======================
    EVM event name: sys.unix.binlog.hw.scsi

        Binary error log events are posted through the binlogd daemon, and
        stored in the binary error log file, /var/adm/binary.errlog. This
        event is used to report all SCSI device errors, including disk,
        tape, HSZ raid events and adapter errors.

        Action: Use Compaq Analyze or DECevent to read and analyze the
        system error log to determine if a SCSI device may need to be
        replaced.

    ======================================================================

    Formatted Message:
        SCSI event

    Event Data Items:
        Event Name : sys.unix.binlog.hw.scsi
        Priority : 700
        PID : 466
        PPID : 1
        Event Id : 1660
        Timestamp : 25-Aug-2003 06:03:04
        Host IP address : 10.34.80.2
        Host Name : alpha0
        User Name : root
        Format : SCSI event
        Reference : cat:evmexp.cat:300

    Variable Items:
        subid_class (INT32) = 199
        subid_num (INT32) = 0
        subid_unit_num (INT32) = 8
        subid_type (INT32) = 34
        binlog_event (OPAQUE) = [OPAQUE VALUE: 1224 bytes]

    ============================ Translation =============================
    Sequence number of error: -129694387
    Time of error entry: 25-Aug-2003 06:03:04
    Host name: alpha0

    SCSI CAM ERROR PACKET
    SCSI device class: DEC SIM
    Bus Number: 0
    Target number: 1
    Lun Number: 0

    Name of routine that logged the event: ss_perform_timeout
    Event information: timeout on disconnected request

                    ############### Entry End ###############

    Event information: Active CCB at time of error

                    ############### Entry End ###############

    ======================================================================
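    When many of these entries accumulate, it can help to tally them by bus/target/LUN to see whether the timeouts cluster on one device. The following is only a minimal sketch, assuming the translated text has already been saved to a string or file in the layout shown above; the function name tally_cam_errors and the SAMPLE text are hypothetical and not part of any Tru64 tool.

```python
import re
from collections import Counter

# Field labels as they appear in the translated "SCSI CAM ERROR PACKET" text above.
FIELD_RE = re.compile(r"^\s*(Bus Number|Target number|Lun Number):\s*(\d+)\s*$")
ROUTINE_RE = re.compile(r"^\s*Name of routine that logged the event:\s*(\S+)")


def tally_cam_errors(text):
    """Count CAM error packets per (bus, target, lun, logging routine).

    Hypothetical helper: it only understands the text layout shown in this post.
    """
    counts = Counter()
    current = None  # fields of the packet being read, or None outside a packet
    for line in text.splitlines():
        if "SCSI CAM ERROR PACKET" in line:
            current = {}
            continue
        if current is None:
            continue
        m = FIELD_RE.match(line)
        if m:
            # "Bus Number" -> "bus", "Target number" -> "target", "Lun Number" -> "lun"
            current[m.group(1).split()[0].lower()] = int(m.group(2))
            continue
        m = ROUTINE_RE.match(line)
        if m:
            key = (current.get("bus"), current.get("target"),
                   current.get("lun"), m.group(1))
            counts[key] += 1
            current = None  # packet complete; wait for the next header
    return counts


# Sample text copied from the translation section of this post.
SAMPLE = """\
SCSI CAM ERROR PACKET
SCSI device class: DEC SIM
Bus Number: 0
Target number: 1
Lun Number: 0

Name of routine that logged the event: ss_perform_timeout
Event information: timeout on disconnected request
"""

if __name__ == "__main__":
    for (bus, target, lun, routine), n in tally_cam_errors(SAMPLE).items():
        print(f"bus {bus} target {target} lun {lun} via {routine}: {n}")
```

    A count that keeps growing for a single bus/target/LUN would point at that device or its path; this does not replace Compaq Analyze or DECevent, which decode the full binary record.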

