FBSD 5.1-RELEASE-p2 crashes/SMP won't work

From: Hartmann, O. (ohartman_at_klima.physik.uni-mainz.de)
Date: 08/11/03

    Date: Mon, 11 Aug 2003 11:20:44 +0200 (CEST)
    To: freebsd-questions@freebsd.org
    
    

    Hello.

    Since we upgraded our SMP server (a TYAN Thunder 2500 based system) from FreeBSD 4.8
    to FreeBSD 5.1-RELEASE, the machine has crashed sporadically under heavy load, or won't
    boot past detection of the AMI Enterprise 1600 RAID controller!

    The FreeBSD 5.1-RELEASE kernel starts in single-user mode; the 5.1-RELEASE-p2 kernel doesn't!

    At this moment, the only working kernel is a 5.1-CURRENT kernel from two weeks ago
    (see dmesg output below).

    FreeBSD 5.1-RELEASE worked for a while, but once Samba was started and the system came
    under heavy load, it crashed (I got no error message, sorry).

    FreeBSD 5.1-RELEASE-p2 no longer boots at all! The last lines I see while the kernel is
    booting are these:

    amrd0: <LSILogic MegaRAID logical drive> on amr0
    amrd0: 245014MB (501788672 sectors) RAID 5 (optimal)

    and it freezes forever.

    Sometimes I see this message below the last line:

    amr0: bad slot 2 completed

    or

    amr0: bad slot 15 completed

    What does this mean? Could it be a problem with IRQ routing?

    Normally, after the RAID controller has been recognized, a message about the launch of the
    second CPU shows up.

    Using the most recent FreeBSD 5.1-CURRENT sources is impossible on our machine: it freezes completely after a while
    or reboots spontaneously (earlier versions did not!).

    Is any help available?

    Another curiosity is that kernels built with SCHED_ULE freeze much sooner than those built with
    SCHED_4BSD, but SCHED_ULE kernels seem to boot, while SCHED_4BSD kernels sometimes do not.

    Thanks a lot for your help.

    This is the dmesg output of the currently running, and evidently working, kernel:

    Copyright (c) 1992-2003 The FreeBSD Project.
    Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
            The Regents of the University of California. All rights reserved.
    FreeBSD 5.1-CURRENT #2: Fri Jul 25 11:45:43 GMT 2003
        root@atmos.physik.uni-mainz.de:/usr/obj/usr/src/sys/ATMOS
    Timecounter "i8254" frequency 1193182 Hz
    Timecounter "TSC" frequency 868644587 Hz
    CPU: Intel Pentium III (868.64-MHz 686-class CPU)
      Origin = "GenuineIntel" Id = 0x683 Stepping = 3
      Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
    real memory = 2147483648 (2048 MB)
    avail memory = 2086006784 (1989 MB)
    Programming 16 pins in IOAPIC #0
    IOAPIC #0 intpin 2 -> irq 0
    Programming 16 pins in IOAPIC #1
    FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
     cpu0 (BSP): apic id: 1, version: 0x00040011, at 0xfee00000
     cpu1 (AP): apic id: 0, version: 0x00040011, at 0xfee00000
     io0 (APIC): apic id: 2, version: 0x000f0011, at 0xfec00000
     io1 (APIC): apic id: 3, version: 0x000f0011, at 0xfec01000
    netsmb_dev: loaded
    Pentium Pro MTRR support enabled
    npx0: <math processor> on motherboard
    npx0: INT 16 interface
    pcibios: BIOS version 2.10
    Using $PIR table, 12 entries at 0xc00fdf00
    pcib0: <Host to PCI bridge> at pcibus 0 on motherboard
    pci0: <PCI bus> on pcib0
    IOAPIC #1 intpin 13 -> irq 2
    IOAPIC #1 intpin 12 -> irq 16
    pcib1: <PCIBIOS PCI-PCI bridge> at device 0.1 on pci0
    pci1: <PCI bus> on pcib1
    IOAPIC #1 intpin 1 -> irq 17
    pci1: <display, VGA> at device 0.0 (no driver attached)
    sym0: <896> port 0xf800-0xf8ff mem 0xfeafe000-0xfeafffff,0xfeafac00-0xfeafafff irq 2 at device 1.0 on pci0
    sym0: Symbios NVRAM, ID 7, Fast-40, SE, parity checking
    sym0: open drain IRQ line driver, using on-chip SRAM
    sym0: using LOAD/STORE-based firmware.
    sym0: handling phase mismatch from SCRIPTS.
    sym1: <896> port 0xf400-0xf4ff mem 0xfeafc000-0xfeafdfff,0xfeafa800-0xfeafabff irq 16 at device 1.1 on pci0
    sym1: Symbios NVRAM, ID 7, Fast-40, LVD, parity checking
    sym1: open drain IRQ line driver, using on-chip SRAM
    sym1: using LOAD/STORE-based firmware.
    sym1: handling phase mismatch from SCRIPTS.
    isab0: <PCI-ISA bridge> port 0x500-0x50f at device 15.0 on pci0
    isa0: <ISA bus> on isab0
    pci0: <mass storage, ATA> at device 15.1 (no driver attached)
    pcib2: <ServerWorks host to PCI bridge> at pcibus 2 on motherboard
    pci2: <PCI bus> on pcib2
    IOAPIC #1 intpin 8 -> irq 18
    em0: <Intel(R) PRO/1000 Network Connection, Version - 1.6.6> port 0xf0c0-0xf0ff mem 0xf7ee0000-0xf7efffff irq 18 at device 1.0 on pci2
    em0: Speed:N/A Duplex:N/A
    pcib3: <PCI-PCI bridge> at device 2.0 on pci2
    pci3: <PCI bus> on pcib3
    IOAPIC #1 intpin 11 -> irq 19
    pcib4: <PCI-PCI bridge> at device 0.0 on pci3
    pci4: <PCI bus> on pcib4
    IOAPIC #1 intpin 10 -> irq 20
    amr0: <LSILogic MegaRAID> mem 0xf0000000-0xf3ffffff irq 20 at device 0.0 on pci4
    amr0: <LSILogic MegaRAID Enterprise 1600> Firmware G170, BIOS F316, 64MB RAM
    pci3: <mass storage, SCSI> at device 1.0 (no driver attached)
    pci3: <mass storage, SCSI> at device 2.0 (no driver attached)
    orm0: <Option ROMs> at iomem 0xca000-0xcdfff,0xc0000-0xc9fff on isa0
    fdc0: <Enhanced floppy controller (i82077, NE72065 or clone)> at port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on isa0
    fdc0: FIFO enabled, 8 bytes threshold
    fd0: <1440-KB 3.5" drive> on fdc0 drive 0
    atkbdc0: <Keyboard controller (i8042)> at port 0x64,0x60 on isa0
    atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
    kbd0 at atkbd0
    psm0: <PS/2 Mouse> irq 12 on atkbdc0
    psm0: model IntelliMouse, device ID 3
    vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
    sc0: <System console> at flags 0x100 on isa0
    sc0: VGA <8 virtual consoles, flags=0x300>
    sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
    sio0: type 16550A
    sio1 at port 0x2f8-0x2ff irq 3 on isa0
    sio1: type 16550A
    unknown: <PNP0303> can't assign resources (port)
    psmcpnp0: irq resource info is missing; assuming irq 12
    unknown: <PNP0501> can't assign resources (port)
    unknown: <PNP0501> can't assign resources (port)
    unknown: <PNP0700> can't assign resources (port)
    APIC_IO: Testing 8254 interrupt delivery
    APIC_IO: Broken MP table detected: 8254 is not connected to IOAPIC #0 intpin 2
    APIC_IO: routing 8254 via 8259 and IOAPIC #0 intpin 0
    Timecounters tick every 1.000 msec
    ipfw2 initialized, divert enabled, rule-based forwarding enabled, default to deny, logging unlimited
    DUMMYNET initialized (011031)
    Waiting 5 seconds for SCSI devices to settle
    (noperiph:sym0:0:-1:-1): SCSI BUS reset delivered.
    (noperiph:sym1:0:-1:-1): SCSI BUS reset delivered.
    amrd0: <LSILogic MegaRAID logical drive> on amr0
    amrd0: 245014MB (501788672 sectors) RAID 5 (optimal)
    amr0: bad slot 2 completed
    sa0 at sym1 bus 0 target 5 lun 0
    sa0: <HP C5713A H910> Removable Sequential Access SCSI-2 device
    sa0: 40.000MB/s transfers (20.000MHz, offset 31, 16bit)
    ch0 at sym1 bus 0 target 5 lun 1
    ch0: <HP C5713A H910> Removable Changer SCSI-2 device
    ch0: 40.000MB/s transfers (20.000MHz, offset 31, 16bit)
    ch0: 6 slots, 1 drive, 0 pickers, 0 portals
    SMP: AP CPU #1 Launched!
    cd0 at sym0 bus 0 target 3 lun 0
    cd0: <TEAC CD-ROM CD-532S 1.0A> Removable CD-ROM SCSI-2 device
    cd0: 20.000MB/s transfers (20.000MHz, offset 16)
    cd0: Attempt to query device size failed: NOT READY, Medium not present
    Mounting root from ufs:amr0s1a
    setrootbyname failed
    ffs_mountroot: can't find rootvp
    Root mount failed: 6

    Manual root filesystem specification:
      <fstype>:<device> Mount <device> using filesystem <fstype>
                           eg. ufs:da0s1a
      ? List valid disk boot devices
      <empty line> Abort manual input

    mountroot> ufs:amrd0s1a
    Mounting root from ufs:amrd0s1a
    WARNING: /usr/local was not properly dismounted
    WARNING: /usr/obj was not properly dismounted
    WARNING: /usr/src was not properly dismounted
    WARNING: /var was not properly dismounted
    link_elf: symbol swapblist undefined
    KLD linprocfs.ko: depends on linux - not available
    em0: Link is up 1000 Mbps Full Duplex
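
The three things this report looks for in a boot log (did the RAID logical drive attach, did any "bad slot" messages appear, and was the second CPU launched) can be checked mechanically on a saved log. The following is only a minimal sketch in Python, added for illustration: the function name `summarize_boot_log` and the regular expressions are assumptions derived solely from the message formats quoted in this post, and the script does nothing to diagnose why a boot hangs.

```python
import re

# Patterns modelled on the messages quoted in this post (illustrative only).
RAID_DRIVE = re.compile(r"^\s*(amrd\d+): <.*logical drive> on (amr\d+)")
BAD_SLOT = re.compile(r"^\s*(amr\d+): bad slot (\d+) completed\s*$")
AP_LAUNCHED = re.compile(r"SMP: AP CPU #(\d+) Launched!")


def summarize_boot_log(text):
    """Collect RAID drive attachments, bad-slot messages and launched APs."""
    summary = {"raid_drives": [], "bad_slots": [], "ap_cpus": []}
    for line in text.splitlines():
        m = RAID_DRIVE.match(line)
        if m:
            summary["raid_drives"].append(m.group(1))
            continue
        m = BAD_SLOT.match(line)
        if m:
            # Store (controller, slot number) for each bad-slot message.
            summary["bad_slots"].append((m.group(1), int(m.group(2))))
            continue
        m = AP_LAUNCHED.search(line)
        if m:
            summary["ap_cpus"].append(int(m.group(1)))
    return summary


if __name__ == "__main__":
    sample = (
        "amrd0: <LSILogic MegaRAID logical drive> on amr0\n"
        "amrd0: 245014MB (501788672 sectors) RAID 5 (optimal)\n"
        "amr0: bad slot 2 completed\n"
        "SMP: AP CPU #1 Launched!\n"
    )
    print(summarize_boot_log(sample))
```

A log in which `raid_drives` is non-empty but `ap_cpus` is empty would correspond to the hang described above, where the boot stops after the amrd0 lines and the second CPU is never reported.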

    --
    MfG
    O. Hartmann
    ohartman@mail.physik.uni-mainz.de
    ------------------------------------------------------------------
    Systemadministration des Institutes fuer Physik der Atmosphaere (IPA)
    ------------------------------------------------------------------
    Johannes Gutenberg Universitaet Mainz
    Becherweg 21
    55099 Mainz
    Tel: +496131/3924662 (Maschinenraum)
    Tel: +496131/3924144 (Buero)
    FAX: +496131/3923532
    _______________________________________________
    freebsd-questions@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-questions
    To unsubscribe, send any mail to "freebsd-questions-unsubscribe@freebsd.org"
    
