MegaRAID 'Bad Slot' Kernel message and crash.

From: Tony Byrne (freebsd_at_byrnehq.com)
Date: 12/29/04

  • Next message: Pertti Kosunen: "Re: TIMEOUT - WRITE_DMA - A possible FIX! turn off ACPI"
    Date: Wed, 29 Dec 2004 11:18:55 +0000
    To: freebsd-stable@freebsd.org
    
    

    Folks,

    We have a 4.10-STABLE production server which has an Intel SRCU42X
    RAID controller installed:

    amr0: <LSILogic MegaRAID> mem 0xfe580000-0xfe5fffff,0xfbef0000-0xfbefffff irq 22 at device 0.0 on pci4
    amr0: <LSILogic Intel(R) RAID Controller SRCU42X> Firmware 411M, BIOS H404, 128MB RAM

    The server crashed in the early hours of yesterday morning, and
    when we arrived on site to reboot it, there was a "bad slot" kernel
    message on the console, which implicates the RAID controller.

    The amr driver man page says that this message indicates a firmware
    or hardware problem with the controller, but we are not convinced. We
    saw the same message and daily lockups while stress testing the box
    under FreeBSD 5.3, which ultimately forced us to 'downgrade' to 4.10
    for production. The box had been rock solid under 4.10 for a number
    of weeks before yesterday's crash.

    Could this indicate a bug in the driver, or at least in its support
    for our re-badged RAID controller? Has anyone else had problems with
    the amr driver with the same card?

    Many thanks,

    Regards,

    Tony.

    -- 
    Tony Byrne
    _______________________________________________
    freebsd-stable@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-stable
    To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
    
