DANGER WILL ROBINSON! SERIOUS problem with current 5.4-PRERELEASE

From: Karl Denninger (karl_at_denninger.net)
Date: 03/30/05

    Date: Tue, 29 Mar 2005 20:08:41 -0600
    To: freebsd-stable@freebsd.org
    
    

    WARNING!

    FreeBSD 5.4-PRERELEASE #6: Tue Mar 29 12:44:22 CST 2005 karl@FS.denninger.net:/usr/obj/usr/src/sys/KSD-SMP

    Copyright (c) 1992-2005 The FreeBSD Project.
    Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
            The Regents of the University of California. All rights reserved.
    FreeBSD 5.4-PRERELEASE #6: Tue Mar 29 12:44:22 CST 2005
        karl@FS.denninger.net:/usr/obj/usr/src/sys/KSD-SMP
    ACPI APIC Table: <DELL PE400SC>
    Timecounter "i8254" frequency 1193182 Hz quality 0
    CPU: Intel(R) Pentium(R) 4 CPU 2.40GHz (2394.01-MHz 686-class CPU)
      Origin = "GenuineIntel" Id = 0xf29 Stepping = 9
      Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
      Hyperthreading: 2 logical CPUs
    real memory = 267862016 (255 MB)
    avail memory = 252456960 (240 MB)
    FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
     cpu0 (BSP): APIC ID: 0
     cpu1 (AP): APIC ID: 1
    ioapic0: Changing APIC ID to 2
    ioapic0 <Version 2.0> irqs 0-23 on motherboard
    npx0: <math processor> on motherboard
    npx0: INT 16 interface
    acpi0: <DELL PE400SC> on motherboard
    acpi0: Power Button (fixed)
    Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
    acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
    cpu0: <ACPI CPU> on acpi0
    cpu1: <ACPI CPU> on acpi0
    acpi_button0: <Power Button> on acpi0
    pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
    pci0: <ACPI PCI bus> on pcib0
    agp0: <Intel 82875P host to AGP bridge> mem 0xe8000000-0xefffffff at device 0.0 on pci0
    pcib1: <PCI-PCI bridge> at device 1.0 on pci0
    pci1: <PCI bus> on pcib1
    pci1: <display, VGA> at device 0.0 (no driver attached)
    uhci0: <Intel 82801EB (ICH5) USB controller USB-A> port 0xff80-0xff9f irq 16 at device 29.0 on pci0
    usb0: <Intel 82801EB (ICH5) USB controller USB-A> on uhci0
    usb0: USB revision 1.0
    uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
    uhub0: 2 ports with 2 removable, self powered
    uhci1: <Intel 82801EB (ICH5) USB controller USB-B> port 0xff60-0xff7f irq 19 at device 29.1 on pci0
    usb1: <Intel 82801EB (ICH5) USB controller USB-B> on uhci1
    usb1: USB revision 1.0
    uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
    uhub1: 2 ports with 2 removable, self powered
    uhci2: <Intel 82801EB (ICH5) USB controller USB-C> port 0xff40-0xff5f irq 18 at device 29.2 on pci0
    usb2: <Intel 82801EB (ICH5) USB controller USB-C> on uhci2
    usb2: USB revision 1.0
    uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
    uhub2: 2 ports with 2 removable, self powered
    uhci3: <Intel 82801EB (ICH5) USB controller USB-D> port 0xff20-0xff3f irq 16 at device 29.3 on pci0
    usb3: <Intel 82801EB (ICH5) USB controller USB-D> on uhci3
    usb3: USB revision 1.0
    uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
    uhub3: 2 ports with 2 removable, self powered
    pci0: <serial bus, USB> at device 29.7 (no driver attached)
    pcib2: <ACPI PCI-PCI bridge> at device 30.0 on pci0
    pci2: <ACPI PCI bus> on pcib2
    atapci0: <SiI 3112 SATA150 controller> port 0xcd70-0xcd7f,0xcd5c-0xcd5f,0xcd68-0xcd6f,0xcd58-0xcd5b,0xcd60-0xcd67 mem 0xfe7dee00-0xfe7defff irq 21 at device 0.0 on pci2
    ata2: channel #0 on atapci0
    ata3: channel #1 on atapci0
    ahc0: <Adaptec 2940 Ultra SCSI adapter> port 0xce00-0xceff mem 0xfe7df000-0xfe7dffff irq 22 at device 1.0 on pci2
    aic7880: Ultra Wide Channel A, SCSI Id=7, 16/253 SCBs
    rp0: <RocketPort PCI> port 0xcd80-0xcdbf irq 17 at device 2.0 on pci2
    RocketPort0 (Version 3.02) 4 ports.
    pcib3: <PCI-PCI bridge> at device 3.0 on pci2
    pci3: <PCI bus> on pcib3
    fxp0: <Intel 82558 Pro/100 Ethernet> port 0xbf80-0xbf9f mem 0xfe400000-0xfe4fffff,0xf8001000-0xf8001fff irq 19 at device 4.0 on pci3
    miibus0: <MII bus> on fxp0
    inphy0: <i82555 10/100 media interface> on miibus0
    inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
    fxp0: Ethernet address: 00:d0:b7:6f:ce:e8
    fxp1: <Intel 82558 Pro/100 Ethernet> port 0xbfe0-0xbfff mem 0xfe500000-0xfe5fffff,0xf8000000-0xf8000fff irq 18 at device 5.0 on pci3
    miibus1: <MII bus> on fxp1
    inphy1: <i82555 10/100 media interface> on miibus1
    inphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
    fxp1: Ethernet address: 00:d0:b7:6f:ce:e9
    em0: <Intel(R) PRO/1000 Network Connection, Version - 1.7.35> port 0xcdc0-0xcdff mem 0xfe7e0000-0xfe7fffff irq 18 at device 12.0 on pci2
    em0: Ethernet address: 00:0c:f1:c9:df:c5
    em0: Speed:N/A Duplex:N/A
    isab0: <PCI-ISA bridge> at device 31.0 on pci0
    isa0: <ISA bus> on isab0
    atapci1: <Intel ICH5 UDMA100 controller> port 0xffa0-0xffaf,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 irq 18 at device 31.1 on pci0
    ata0: channel #0 on atapci1
    ata1: channel #1 on atapci1
    atapci2: <Intel ICH5 SATA150 controller> port 0xfea0-0xfeaf,0xfe30-0xfe33,0xfe20-0xfe27,0xfe10-0xfe13,0xfe00-0xfe07 irq 18 at device 31.2 on pci0
    ata4: channel #0 on atapci2
    ata5: channel #1 on atapci2
    pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
    pci0: <multimedia, audio> at device 31.5 (no driver attached)
    fdc0: <floppy drive controller> port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
    fd0: <1440-KB 3.5" drive> on fdc0 drive 0
    atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
    atkbd0: <AT Keyboard> irq 1 on atkbdc0
    kbd0 at atkbd0
    sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
    sio0: type 16550A
    sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
    sio1: type 16550A
    orm0: <ISA Option ROMs> at iomem 0xd1800-0xd3fff,0xd0000-0xd17ff,0xcf800-0xcffff,0xcb000-0xcf7ff,0xc0000-0xcafff on isa0
    pmtimer0 on isa0
    ppc0: parallel port not found.
    sc0: <System console> at flags 0x100 on isa0
    sc0: VGA <16 virtual consoles, flags=0x300>
    vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
    RTC BIOS diagnostic error 18<memory_size,fixed_disk>
    Timecounters tick every 10.000 msec
    ipfw2 initialized, divert enabled, rule-based forwarding disabled, default to deny, logging disabled
    acd0: CDROM <Lite-On LTN486S 48x Max/YDS6> at ata1-master UDMA33
    ad4: 238475MB <HDS722525VLSA80/V36OA63A> [484521/16/63] at ata2-master SATA150
    ad6: 239372MB <Maxtor 6B250S0/BANC1B70> [486344/16/63] at ata3-master SATA150
    em0: Link is up 100 Mbps Full Duplex
    ad8: 239372MB <Maxtor 6B250S0/BANC1980> [486344/16/63] at ata4-master SATA150
    ad10: 238475MB <HDS722525VLSA80/V36OA63A> [484521/16/63] at ata5-master SATA150
    Waiting 15 seconds for SCSI devices to settle
    GEOM_MIRROR: Device boot created (id=1131801609).
    GEOM_MIRROR: Device boot: provider ad8s1 detected.
    GEOM_MIRROR: Device boot: provider ad10s1 detected.
    GEOM_MIRROR: Device boot: provider ad10s1 activated.
    GEOM_MIRROR: Device boot: provider mirror/boot launched.
    GEOM_MIRROR: Device boot: rebuilding provider ad8s1.
    sa0 at ahc0 bus 0 target 2 lun 0
    sa0: <DEC DLT2000 15/30 GB 840B> Removable Sequential Access SCSI-2 device
    sa0: 5.000MB/s transfers (5.000MHz, offset 15)
    SMP: AP CPU #1 Launched!
    Mounting root from ufs:/dev/mirror/boota
    WARNING: /dbms was not properly dismounted
    WARNING: /disk was not properly dismounted
    WARNING: /archive was not properly dismounted
    em0: Link is up 100 Mbps Full Duplex

    Built this afternoon.

    This has a fix in the ATA code, to wit:

    mdodd 2005-03-23 04:50:26 UTC

      FreeBSD src repository

      Modified files: (Branch: RELENG_5)
        sys/dev/ata ata-queue.c
      Log:
      MFC

      1.42: When resubmitting a timed out request, reset donecount.
      1.41: Reset timeout when we are back from interrupt.
      1.40: Correct logical error, result was that retries wasn't always made but
            failure reported instead.
      1.39: Do not retry on requests that have lost their device during reinit.

      Approved by: re

      Revision Changes Path
      1.32.2.6 +10 -3 src/sys/dev/ata/ata-queue.c

    This change is EXTREMELY DANGEROUS.

    If your system took a RECOVERABLE DMA write error (as is happening
    frequently on machines with SATA drives on the PCI bus!), the drive used
    to be disconnected from the mirror.

    NOW you end up with a RADICALLY unstable machine. Specifically, interrupts
    now get seriously screwed up, serial I/O on the machine stops working
    immediately, and ultimately the machine crashes or hangs in VERY odd ways.

    This change needs to be backed out immediately until it can be determined
    why a requeued request destabilizes the system.
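
    For anyone trying to picture what the 1.42 log entry ("When resubmitting
    a timed out request, reset donecount") is about, here is a toy model. It
    is NOT the ata-queue.c code: the Request class and the transfer(),
    resubmit() and run() helpers are hypothetical names made up for this
    illustration, and the byte counts are arbitrary. It only shows how a
    progress counter left over from a failed attempt changes what the retry
    does, and it says nothing about why the real requeue path destabilizes
    the machine.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Toy stand-in for a queued disk request (hypothetical, not the kernel struct)."""
    bytecount: int       # total bytes the caller asked for
    donecount: int = 0   # bytes counted as completed so far
    retries: int = 2     # remaining retry budget


def transfer(req, chunk, fail_after=None):
    """Move data in chunks; return False to model a timeout once
    `fail_after` bytes have been counted as done."""
    while req.donecount < req.bytecount:
        if fail_after is not None and req.donecount >= fail_after:
            return False
        req.donecount += min(chunk, req.bytecount - req.donecount)
    return True


def resubmit(req, reset_donecount=True):
    """Requeue a timed-out request, optionally clearing the progress counter."""
    req.retries -= 1
    if reset_donecount:
        req.donecount = 0
    return req


def run(reset_donecount):
    """Time out once partway through, retry, and report how many bytes
    the retry actually moved."""
    req = Request(bytecount=4096)
    moved_on_retry = 0
    if not transfer(req, chunk=512, fail_after=1024):
        resubmit(req, reset_donecount)
        before = req.donecount
        transfer(req, chunk=512)
        moved_on_retry = req.donecount - before
    return moved_on_retry


if __name__ == "__main__":
    print("retry with reset:   ", run(True), "bytes")
    print("retry without reset:", run(False), "bytes")
```

    In this model the retry redoes the whole request only when the counter
    is cleared first; with a stale counter it treats the bytes from the
    failed attempt as already done.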

    I have removed the ad4 and ad6 drives here (which are the ones on the PCI
    bus) until this is addressed.

    -- 
    Karl Denninger (karl@denninger.net) Internet Consultant & Kids Rights Activist
    http://www.denninger.net	My home on the net - links to everything I do!
    http://scubaforum.org		Your UNCENSORED place to talk about DIVING!
    http://www.spamcuda.net		SPAM FREE mailboxes - FREE FOR A LIMITED TIME!
    http://genesis3.blogspot.com	Musings Of A Sentient Mind
    _______________________________________________
    freebsd-stable@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-stable
    To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
    
