Panic on shutdown -r now (RESOLVED?)

From: David Boyd (David.Boyd_at_insightbb.com)
Date: 11/30/04

    To: <freebsd-current@freebsd.org>
    Date: Tue, 30 Nov 2004 15:58:24 -0500
    
    

    The following problem was reported (by me and others) from about 5.3-BETA4
    through 5.3-RELEASE.

    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    This problem persisted into 5.3-RELEASE. It may be related to panics
    reported by others.

    The problem appears to be related to SMP and HTT. It doesn't occur (for me)
    with GENERIC.

    It has been very difficult to obtain a "usable" dump. The system is usually
    locked tight.

    The kernel is built with KDB, DDB, and BREAK_TO_DEBUGGER. Even when the
    system gets as far as indicating that a panic has occurred, it seldom
    enters the debugger. When it does enter the debugger, it usually ignores
    key input, echoing a colon or semicolon when the ENTER key is pressed.

    Oh, yeah! About once in every fifty tries the system reboots normally.

    This problem started during BETA testing ... back around BETA4 or BETA5 as I
    recall.

    Here's what I have for today (system is from RC2 ISO image).

    from serial console:

    The garbage in the display after "Shutting down ACPI" is "normal" for this
    problem.

    ============================================================================
    KDB: debugger backends: ddb
    KDB: current backend: ddb
    Copyright (c) 1992-2004 The FreeBSD Project.
    Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
            The Regents of the University of California. All rights reserved.
    FreeBSD 5.3-RC2 #0: Mon Nov 1 14:48:42 EST 2004
        root@comm-server.support.bsd1.net:/usr/src/sys/i386/compile/DEBUG
    ACPI APIC Table: <INTEL PRODUCT8>
    Timecounter "i8254" frequency 1193182 Hz quality 0
    CPU: Intel(R) Pentium(R) 4 CPU 2.40GHz (2394.01-MHz 686-class CPU)
      Origin = "GenuineIntel" Id = 0xf29 Stepping = 9
      Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
      Hyperthreading: 2 logical CPUs
    real memory = 534970368 (510 MB)
    avail memory = 513937408 (490 MB)
    FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
     cpu0 (BSP): APIC ID: 0
     cpu1 (AP): APIC ID: 1
    ioapic0 <Version 2.0> irqs 0-23 on motherboard
    npx0: [FAST]
    npx0: <math processor> on motherboard
    npx0: INT 16 interface
    acpi0: <INTEL PRODUCT8> on motherboard
    acpi0: Power Button (fixed)
    Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
    acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
    cpu0: <ACPI CPU> on acpi0
    cpu1: <ACPI CPU> on acpi0
    cpu1: Failed to attach throttling P_CNT
    pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
    pci0: <ACPI PCI bus> on pcib0
    agp0: <Intel 82865 host to AGP bridge> mem 0xf8000000-0xfbffffff at device
    0.0 on pci0
    pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
    pci1: <ACPI PCI bus> on pcib1
    pci1: <display, VGA> at device 0.0 (no driver attached)
    uhci0: <Intel 82801EB (ICH5) USB controller USB-A> port 0xcc00-0xcc1f irq 16
    at device 29.0 on pci0
    uhci0: [GIANT-LOCKED]
    usb0: <Intel 82801EB (ICH5) USB controller USB-A> on uhci0
    usb0: USB revision 1.0
    uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
    uhub0: 2 ports with 2 removable, self powered
    uhci1: <Intel 82801EB (ICH5) USB controller USB-B> port 0xd000-0xd01f irq 19
    at device 29.1 on pci0
    uhci1: [GIANT-LOCKED]
    usb1: <Intel 82801EB (ICH5) USB controller USB-B> on uhci1
    usb1: USB revision 1.0
    uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
    uhub1: 2 ports with 2 removable, self powered
    uhci2: <Intel 82801EB (ICH5) USB controller USB-C> port 0xd400-0xd41f irq 18
    at device 29.2 on pci0
    uhci2: [GIANT-LOCKED]
    usb2: <Intel 82801EB (ICH5) USB controller USB-C> on uhci2
    usb2: USB revision 1.0
    uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
    uhub2: 2 ports with 2 removable, self powered
    uhci3: <Intel 82801EB (ICH5) USB controller USB-D> port 0xd800-0xd81f irq 16
    at device 29.3 on pci0
    uhci3: [GIANT-LOCKED]
    usb3: <Intel 82801EB (ICH5) USB controller USB-D> on uhci3
    usb3: USB revision 1.0
    uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
    uhub3: 2 ports with 2 removable, self powered
    pci0: <serial bus, USB> at device 29.7 (no driver attached)
    pcib2: <ACPI PCI-PCI bridge> at device 30.0 on pci0
    pci2: <ACPI PCI bus> on pcib2
    atapci0: <Promise PDC20270 UDMA100 controller> port
    0xac00-0xac0f,0xb000-0xb003,0xb400-0xb407,0xb800-0xb803,0xbc00-0xbc07 mem
    0xfeaf0000-0xfeafffff irq 17 at device 2.0 on pci2
    ata2: channel #0 on atapci0
    ata3: channel #1 on atapci0
    rl0: <D-Link DFE-530TX+ 10/100BaseTX> port 0xa800-0xa8ff mem
    0xfeadfc00-0xfeadfcff irq 19 at device 3.0 on pci2
    miibus0: <MII bus> on rl0
    rlphy0: <RealTek internal media interface> on miibus0
    rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
    rl0: Ethernet address: 00:0d:88:35:39:a0
    rl1: <D-Link DFE-530TX+ 10/100BaseTX> port 0xa400-0xa4ff mem
    0xfeadf800-0xfeadf8ff irq 18 at device 4.0 on pci2
    miibus1: <MII bus> on rl1
    rlphy1: <RealTek internal media interface> on miibus1
    rlphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
    rl1: Ethernet address: 00:0d:88:37:d7:ba
    fxp0: <Intel 82801BA (D865) Pro/100 VE Ethernet> port 0xa000-0xa03f mem
    0xfeade000-0xfeadefff irq 20 at device 8.0 on pci2
    miibus2: <MII bus> on fxp0
    inphy0: <i82562ET 10/100 media interface> on miibus2
    inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
    fxp0: Ethernet address: 00:11:11:0a:46:7b
    isab0: <PCI-ISA bridge> at device 31.0 on pci0
    isa0: <ISA bus> on isab0
    atapci1: <Intel ICH5 UDMA100 controller> port
    0xffa0-0xffaf,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 31.1 on pci0
    ata0: channel #0 on atapci1
    ata1: channel #1 on atapci1
    atapci2: <Intel ICH5 SATA150 controller> port
    0xdc00-0xdc0f,0xe000-0xe003,0xe400-0xe407,0xe800-0xe803,0xec00-0xec07 irq 18
    at device 31.2 on pci0
    ata4: channel #0 on atapci2
    ata5: channel #1 on atapci2
    pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
    pci0: <multimedia, audio> at device 31.5 (no driver attached)
    acpi_button0: <Sleep Button> on acpi0
    atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
    atkbd0: <AT Keyboard> irq 1 on atkbdc0
    kbd0 at atkbd0
    atkbd0: [GIANT-LOCKED]
    psm0: <PS/2 Mouse> irq 12 on atkbdc0
    psm0: [GIANT-LOCKED]
    psm0: model IntelliMouse Explorer, device ID 4
    fdc0: <floppy drive controller> port
    0x3f7,0x3f4-0x3f5,0x3f2-0x3f3,0x3f0-0x3f1 irq 6 drq 2 on acpi0
    fdc0: [FAST]
    fd0: <1440-KB 3.5" drive> on fdc0 drive 0
    sio0: configured irq 4 not in bitmap of probed irqs 0
    sio0: port may not be enabled
    sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on
    acpi0
    sio0: type 16550A, console
    ppc0: <Standard parallel printer port> port 0x378-0x37f irq 7 on acpi0
    ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
    ppbus0: <Parallel port bus> on ppc0
    plip0: <PLIP network interface> on ppbus0
    lpt0: <Printer> on ppbus0
    lpt0: Interrupt-driven port
    ppi0: <Parallel I/O> on ppbus0
    orm0: <ISA Option ROMs> at iomem
    0xd6800-0xd77ff,0xd5800-0xd67ff,0xcc000-0xd57ff on isa0
    pmtimer0 on isa0
    sc0: <System console> at flags 0x100 on isa0
    sc0: VGA <16 virtual consoles, flags=0x100>
    sio1: configured irq 3 not in bitmap of probed irqs 0
    sio1: port may not be enabled
    vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
    Timecounters tick every 10.000 msec
    acpi_cpu: throttling enabled, 8 steps (100% to 12.5%), currently 100.0%
    acd0: CDROM <CREATIVE CD5233E-N/0.20> at ata0-master UDMA33
    ata1-slave: FAILURE - ATA_IDENTIFY timed out
    ata1-slave: FAILURE - ATA_IDENTIFY timed out
    ata1-master: FAILURE - SETFEATURES SET TRANSFER MODE status=1<ERROR>
    error=4<ABORTED>
    ata1-slave: FAILURE - ATA_IDENTIFY timed out
    ata1-master: FAILURE - SETFEATURES SET TRANSFER MODE status=1<ERROR>
    error=4<ABORTED>
    afd0: REMOVABLE <IOMEGA ZIP 100 ATAPI/03.H> at ata1-master BIOSPIO
    ad4: 76319MB <ST380011A/3.06> [155061/16/63] at ata2-master UDMA100
    ad6: 76319MB <ST380011A/3.06> [155061/16/63] at ata3-master UDMA100
    ar0: 76319MB <ATA RAID1 array> [9729/255/63] status: READY subdisks:
     disk0 READY on ad4 at ata2-master
     disk1 READY on ad6 at ata3-master
    SMP: AP CPU #1 Launched!
    Mounting root from ufs:/dev/ar0s1a
    Pre-seeding PRNG: kickstart.
    Loading configuration files.
    Entropy harvesting: interrupts ethernet point_to_point kickstart.
    kernel dumps on /dev/ar0s1b
    swapon: adding /dev/ar0s1b as swap device
    Starting file system checks:
    /dev/ar0s1a: FILE SYSTEM CLEAN; SKIPPING CHECKS
    /dev/ar0s1a: clean, 2000849 free (585 frags, 250033 blocks, 0.0%
    fragmentation)
    /dev/ar0s1d: FILE SYSTEM CLEAN; SKIPPING CHECKS
    /dev/ar0s1d: clean, 3419753 free (40945 frags, 422351 blocks, 1.0%
    fragmentation)
    /dev/ar0s1e: FILE SYSTEM CLEAN; SKIPPING CHECKS
    /dev/ar0s1e: clean, 8121013 free (461 frags, 1015069 blocks, 0.0%
    fragmentation)
    /dev/ar0s1f: FILE SYSTEM CLEAN; SKIPPING CHECKS
    /dev/ar0s1f: clean, 4061052 free (28 frags, 507628 blocks, 0.0%
    fragmentation)
    /dev/ar0s1g: FILE SYSTEM CLEAN; SKIPPING CHECKS
    /dev/ar0s1g: clean, 2029028 free (28 frags, 253625 blocks, 0.0%
    fragmentation)
    /dev/ar0s1h: FILE SYSTEM CLEAN; SKIPPING CHECKS
    /dev/ar0s1h: clean, 3045027 free (27 frags, 380625 blocks, 0.0%
    fragmentation)
    /dev/ar0s2d: FILE SYSTEM CLEAN; SKIPPING CHECKS
    /dev/ar0s2d: clean, 13470149 free (21 frags, 1683766 blocks, 0.0%
    fragmentation)
    Setting hostname: comm-server.support.bsd1.net.
    lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
            inet 127.0.0.1 netmask 0xff000000
            inet6 ::1 prefixlen 128
            inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
    Starting dhclient.
    fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
            options=8<VLAN_MTU>
            inet6 fe80::211:11ff:fe0a:467b%fxp0 prefixlen 64 scopeid 0x3
            inet 192.168.210.51 netmask 0xffffff00 broadcast 192.168.210.255
            ether 00:11:11:0a:46:7b
            media: Ethernet autoselect (100baseTX <full-duplex>)
            status: active
    Additional routing options: IP gateway=YES.
    Starting devd.
    Mounting NFS file systems:.
    Starting syslogd.
    Nov 1 15:56:44 comm-server syslogd: kernel boot file is /boot/kernel/kernel
    Checking for core dump on /dev/ar0s1b ...
    savecore: no dumps found
    Setting date via ntp.
    Looking for host 192.168.210.1 and service ntp
    host found : free.bsd1.net
     1 Nov 15:56:45 ntpdate[312]: step time server 192.168.210.1 offset 1.115163
    sec
    ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/X11R6/lib
    /usr/local/lib
    a.out ldconfig path: /usr/lib/aout /usr/lib/compat/aout /usr/X11R6/lib/aout
    Starting usbd.
    Starting local daemons:.
    Updating motd.
    Configuring syscons: blanktime.
    Starting sshd.
    Initial i386 initialization:.
    Additional ABI support:.
    Starting cron.
    Local package initialization:.
    Additional TCP options:.
    Starting background file system checks in 60 seconds.

    Mon Nov 1 15:56:47 EST 2004
     FreeBSD/i386 (comm-server.support.bsd1.net) (ttyd0) login: root
    Password:
    Nov 1 15:56:52 comm-server login: ROOT LOGIN (root) ON ttyd0 Last login:
    Mon Nov 1 15:10:59 on ttyd0
    Copyright (c) 1992-2004 The FreeBSD Project.
    Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
            The Regents of the University of California. All rights reserved.

    FreeBSD 5.3-RC2 (DEBUG) #0: Mon Nov 1 14:48:42 EST 2004

    Welcome to FreeBSD!

    Before seeking technical support, please use the following resources:

    o Security advisories and updated errata information for all releases are
       at http://www.FreeBSD.org/releases/ - always consult the ERRATA section
       for your release first as it's updated frequently.

    o The Handbook and FAQ documents are at http://www.FreeBSD.org/ and,
       along with the mailing lists, can be searched by going to
       http://www.FreeBSD.org/search/. If the doc distribution has
       been installed, they're also available formatted in /usr/share/doc.

    If you still have a question or problem, please take the output of
    `uname -a', along with any relevant error messages, and email it
    as a question to the questions@FreeBSD.org mailing list. If you are
    unfamiliar with FreeBSD's directory layout, please refer to the hier(7)
    manual page. If you are not familiar with manual pages, type `man man'.

    You may also use sysinstall(8) to re-enter the installation and
    configuration utility. Edit /etc/motd to change this login announcement.

    erase ^H, kill ^U, intr ^C status ^T
    FreeBSD
    cons25
    ttyd0
    [comm-server.support.bsd1.net:ttyd0:/root ]> shutdown -r now
    Shutdown NOW!
    shutdown: [pid 497]

        *** FINAL System shutdown message from
    root@comm-server.support.bsd1.net ***
    System going down IMMEDIATELY
    Nov 1 15:56:58 comm-server shutdown: reboot by root:
    [comm-server.support.bsd1.net:ttyd0:/root ]> System shutdown time has
    arrived
    Shutting down daemon processes:.
    Stopping cron.
    Shutting down local daemons:.
    Writing entropy file:.
    .
    Nov 1 15:57:00 comm-server syslogd: exiting on signal 15
    boot() called on cpu#1
    Waiting (max 60 seconds) for system process `vnlru' to stop...done
    Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
    Waiting (max 60 seconds) for system process `syncer' to stop...
    Syncing disks, vnodes remaining...4 4 2 2 0 0 0 done
    No buffers busy after final sync
    Uptime: 52s
    Waiting (max 60 seconds) for system process `hpt_wt' to stop...done
    Shutting down ACPI
    kk
    e
    rFnaetla lt rdaopu b1l2e wfiatuhl ti:n
    t
    eerirpu p=t s0 xdci1s9aabcl4ebdc
    esp = 0x6460c19a
    ebp = 0x0
    cpuid = 1; apic id = 01
    panic: double fault
    cpuid = 1
    KDB: enter: panic
    [thread 100002]
    Stopped at kdb_enter+0x2b: nop
    db> where
    kdb_enter(c08291f5) at kdb_enter+0x2b
    panic(c084267e,c08427ef,1,0,0) at panic+0x127
    dblfault_handler() at dblfault_handler+0x7a
    --- trap 0x17, eip = 0xc19ac4bc, esp = 0x6460c19a, ebp = 0 ---
    _end() at 0xc19ac4bc
    db> trace
    kdb_enter(c08291f5) at kdb_enter+0x2b
    panic(c084267e,c08427ef,1,0,0) at panic+0x127
    dblfault_handler() at dblfault_handler+0x7a
    --- trap 0x17, eip = 0xc19ac4bc, esp = 0x6460c19a, ebp = 0 ---
    _end() at 0xc19ac4bc
    db> call doadump
    Dumping 510 MB
     16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 320
    336 352 368 384 400 416 432 448 464 480 496
    Dump complete
    0xf
    db> reset
    cpu_reset called on cpu#1
    cpu_reset: Restarting BSP
    cpu_reset_proxy: Stopped CPU 1
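
The garbled line printed after "Shutting down ACPI" can be read as two kernel messages written to the console at the same time, character by character. The sketch below checks that reading for one line of the log. It assumes the two messages are "Fatal double fault:" and a fragment of "kernel trap 12 with interrupts disabled"; that pairing is a guess from the visible characters, not something the log states. Spaces are dropped before comparing, on the assumption that whitespace may not have survived in this copy of the log. The function and variable names are illustrative only.

```python
def is_interleaving(a: str, b: str, mixed: str) -> bool:
    """Return True if `mixed` merges a and b while keeping each one's character order."""
    if len(a) + len(b) != len(mixed):
        return False
    # reachable[j] is True when a[:i] and b[:j] can be merged into mixed[:i + j].
    reachable = [False] * (len(b) + 1)
    for i in range(len(a) + 1):
        for j in range(len(b) + 1):
            if i == 0 and j == 0:
                reachable[j] = True
                continue
            ch = mixed[i + j - 1]
            # reachable[j] still holds the previous row (i - 1) at this point.
            from_a = i > 0 and reachable[j] and a[i - 1] == ch
            # reachable[j - 1] was already updated for the current row i.
            from_b = j > 0 and reachable[j - 1] and b[j - 1] == ch
            reachable[j] = from_a or from_b
    return reachable[len(b)]


def strip_spaces(text: str) -> str:
    """Drop spaces so that lost or collapsed whitespace does not affect the check."""
    return text.replace(" ", "")


# One garbled line copied from the serial console log above.
garbled = "rFnaetla lt rdaopu b1l2e wfiatuhl ti:n"
# Assumed sources (hypothetical pairing, see the note above).
msg_a = "Fatal double fault:"
msg_b = "rnel trap 12 with in"

if __name__ == "__main__":
    print(is_interleaving(strip_spaces(msg_a), strip_spaces(msg_b),
                          strip_spaces(garbled)))
```

If the check succeeds, the line is consistent with both CPUs printing at once, which would fit the observation that the problem appears only with SMP and HTT. It does not prove that this is what happened.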

    from kgdb:
    ============================================================================
    kgdb kernel.debug vmcore.0
    [GDB will not be able to debug user-mode threads:
    /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
    GNU gdb 6.1.1 [FreeBSD]
    Copyright 2004 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain
    conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "i386-marcel-freebsd".
    doadump () at pcpu.h:159
    (kgdb) where
    #0 doadump () at pcpu.h:159
    #1 0xc0460cd6 in db_fncall (dummy1=0, dummy2=0, dummy3=-1064198276,
        dummy4=0xc0919f64 "\230\237\221À\200%") at ../../../ddb/db_command.c:531
    #2 0xc0460ae4 in db_command (last_cmdp=0xc08c7a44, cmd_table=0x0,
        aux_cmd_tablep=0xc0848104, aux_cmd_tablep_end=0xc0848120)
        at ../../../ddb/db_command.c:349
    #3 0xc0460bac in db_command_loop () at ../../../ddb/db_command.c:455
    #4 0xc0462725 in db_trap (type=3, code=0) at ../../../ddb/db_main.c:221
    #5 0xc062adc7 in kdb_trap (type=3, code=0, tf=0x1)
        at ../../../kern/subr_kdb.c:418
    #6 0xc07c2f74 in trap (frame=
          {tf_fs = -1064239080, tf_es = -1067319280, tf_ds = -1065222128, tf_edi
    = -1065081218, tf_esi = 1, tf_ebp = -1064197916, tf_isp = -1064197936,
    tf_ebx = -1064197872, tf_edx = 0, tf_ecx = -1056882688, tf_eax = 18,
    tf_trapno = 3, tf_err = 0, tf_eip = -1067275477, tf_cs = 8, tf_eflags =
    16534, tf_esp = -1064197884, tf_ss = -1067371761}) at
    ../../../i386/i386/trap.c:576
    #7 0xc07b0d1a in calltrap () at ../../../i386/i386/exception.s:140
    #8 0xc0910018 in sc_buffer.5 ()
    #9 0xc0620010 in umtx_remove (uq=0xc091a110, td=0x0)
        at ../../../kern/kern_umtx.c:135
    #10 0xc061330f in panic (fmt=0xc084267e "double fault")
        at ../../../kern/kern_shutdown.c:537
    #11 0xc07c3566 in dblfault_handler () at ../../../i386/i386/trap.c:838
    #12 0x00000000 in ?? ()
    (kgdb) quit
    [comm-server.support.bsd1.net:ttyd0:/var/crash ]>
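
kgdb prints the trap frame registers in frame #6 as signed decimal integers, which makes them hard to compare with the hexadecimal addresses in the rest of the backtrace. A minimal sketch of the conversion, assuming 32-bit registers (i386); the helper name to_hex32 is made up for this example, and only a few of the frame's fields are listed.

```python
def to_hex32(value: int) -> str:
    """Format a signed 32-bit integer as an unsigned 32-bit hex string."""
    return f"0x{value & 0xFFFFFFFF:08x}"


# Signed values copied from the trap frame printed in frame #6 above.
trap_frame = {
    "tf_eip": -1067275477,
    "tf_ebp": -1064197916,
    "tf_esp": -1064197884,
    "tf_cs": 8,
}

if __name__ == "__main__":
    for name, value in trap_frame.items():
        print(f"{name} = {to_hex32(value)}")
```

With this conversion tf_eip comes out as 0xc062ab2b, which lies in the same 0xc06xxxxx range as the other return addresses shown in the backtrace.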

    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    The following commit seems to have cured the problem.

            Edit src/sys/kern/kern_shutdown.c
                      Add delta 1.163.2.3 2004.11.29.19.11.36 njl

    If this fix does, in fact, address this problem, can I expect to see it in
    an official patch to 5.3-RELEASE?

    This is the only issue keeping us from upgrading to and deploying
    5.3-RELEASE on all (twenty-two at last count) of our production servers.
    I can't get agreement from my management to deploy 5.3-STABLE, so it's
    either 5.3-RELEASE-px or waiting until 5.4-RELEASE. I'd rather not wait.

    Thanks for any information that you can supply.


