Re: panic after removing usb flash drive

From: Scott Long (scottl_at_samsco.org)
Date: 08/31/05

    Date: Wed, 31 Aug 2005 09:38:20 -0600
    To: Ben Kaduk <minimarmot@gmail.com>
    
    

    Ben Kaduk wrote:
    > On 8/31/05, Kyle Brooks <captinsmock@columbus.rr.com> wrote:
    >
    >>umass0: LEXAR MEDIA JUMPDRIVE2, rev 2.00/1.25, addr 2
    >>umass0: at uhub4 port 6 (addr 2) disconnected
    >>panic: vm_fault: fault on nofault entry, addr: deadc000
    >>
    >>kernel:
    >>
    >>FreeBSD 7.0-CURRENT #2: Mon Aug 29 00:39:21 UTC 2005
    >>
    >>problem:
    >>
    >>kernel panics when usb flash drive is removed
    >>
    >>backtrace:
    >>
    >>#0 doadump () at pcpu.h:165
    >>#1 0xc068610e in boot (howto=260)
    >>at /usr/src/sys/kern/kern_shutdown.c:397
    >>#2 0xc0685b92 in panic (
    >>fmt=0xc090e46c "vm_fault: fault on nofault entry, addr: %lx")
    >>at /usr/src/sys/kern/kern_shutdown.c:553
    >>#3 0xc0812de1 in vm_fault (map=0xc1060000, vaddr=3735928832,
    >>fault_type=2 '\002', fault_flags=0)
    >>at /usr/src/sys/vm/vm_fault.c:884
    >>#4 0xc0888807 in trap_pfault (frame=0xe6a06bf0, usermode=0,
    >>eva=3735929110)
    >>at /usr/src/sys/i386/i386/trap.c:741
    >>#5 0xc0888d04 in trap (frame=
    >>{tf_fs = 8, tf_es = -1063649240, tf_ds = 40, tf_edi = -993875968,
    >>tf_esi = -1014223872, tf_ebp = -425694000, tf_isp = -425694180, tf_ebx =
    >>-1063640044, tf_edx = -993875900, tf_ecx = 0, tf_eax = -559038242,
    >>tf_trapno = 12, tf_err = 2, tf_eip = -1069194040, tf_cs = 32, tf_eflags
    >>= 66050, tf_esp = -1063640032, tf_ss = 0})
    >>at /usr/src/sys/i386/i386/trap.c:442
    >>#6 0xc08745ba in calltrap () at /usr/src/sys/i386/i386/exception.s:139
    >>#7 0x00000008 in ?? ()
    >>#8 0xc09a0028 in atdma_acpi_driver_mod ()
    >>#9 0x00000028 in ?? ()
    >>#10 0xc4c2a800 in ?? ()
    >>#11 0xc38c2c00 in ?? ()
    >>#12 0xe6a06cd0 in ?? ()
    >>#13 0xe6a06c1c in ?? ()
    >>#14 0xc09a2414 in xsoftc ()
    >>#15 0xc4c2a844 in ?? ()
    >>#16 0x00000000 in ?? ()
    >>#17 0xdeadc0de in ?? ()
    >>#18 0x0000000c in ?? ()
    >>#19 0x00000002 in ?? ()
    >>#20 0xc04564c8 in camisr (V_queue=0xc09a2414)
    >>at /usr/src/sys/cam/cam_xpt.c:7066
    >>#21 0xc066f84e in ithread_loop (arg=0xc356fa80)
    >>at /usr/src/sys/kern/kern_intr.c:545
    >>#22 0xc066e808 in fork_exit (callout=0xc066f665 <ithread_loop>, arg=0x0,
    >>frame=0x0) at /usr/src/sys/kern/kern_fork.c:789
    >>#23 0xc087461c in fork_trampoline ()
    >>at /usr/src/sys/i386/i386/exception.s:208
    >>
    >
    > This is the expected behaviour

    Panics are not acceptable or expected behaviour in any situation, btw.

    > if you didn't unmount the filesystem on the
    > thumbdrive before removing it. There was some discussion on this a while ago
    > (but I don't seem to be able to find the exact posts), but the general idea
    > is that the kernel has no idea in what state the actual physical medium
    > (disc) is/was in after being pulled, and may have some stale buffers holding
    > data that got written to disk. It doesn't know what to do with this data, or
    > how to treat requests to that device, so it panics.
    >

    I probably missed the earlier discussion that you are referring to, but
    what you are saying here actually isn't true. There are a number of
    problems:

    1) When the thumbdrive gets pulled, the umass driver gets told to
    detach. It tries to detach itself from CAM, but things don't get torn
    down correctly because there is an open reference to the target in CAM
    (because there is a mounted filesystem on the device). umass trundles
    along anyway and goes away, leaving lots of dangling pointers in CAM
    that blow up on the next attempted I/O access.

    Part of the problem here is that the umass driver is architected wrong.
    It creates a SIM, bus, and target instance for every umass device that
    gets inserted. When the device gets pulled, it tries to tear down
    each of those instances all at once. CAM simply wasn't designed for
    this. It was designed for the SIMs and buses to be long-lived objects
    where only the targets (and luns) come and go. Making umass fit this
    model would involve turning it into two logical drivers. One would be
    a SIM that would attach to the root hub instance of each USB controller
    and would treat the USB bus as a CAM bus. The other would be a target
    driver that gets created and destroyed on a per-device basis as those
    devices come and go. When a umass device gets plugged in, the USB
    framework would tell the appropriate SIM to create a target instance.
    When the device gets pulled, the framework would tell the SIM to detach
    and destroy the target. No dangling pointers would be left behind by
    the SIM going away. I have some prototype work in progress on this.
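
    To make the lifetime rule concrete, here is a minimal user-space
    sketch in C. It is not CAM code and every name in it (struct sim,
    struct target, sim_attach_target and so on) is hypothetical. It only
    models the idea above: the SIM is long-lived, targets come and go,
    and a target's memory is held until the last opener (a mounted
    filesystem, say) lets go of it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical user-space model; none of these names are real CAM APIs. */
#define MAX_TARGETS 8

struct target {
    int  refs;  /* one held by the SIM slot, plus one per opener */
    bool gone;  /* set once the device has been pulled */
};

struct sim {
    struct target *slots[MAX_TARGETS];  /* the SIM outlives its targets */
};

int live_targets;  /* bookkeeping so a caller can see nothing leaks */

/* Drop one reference; the memory goes away only with the last one. */
void target_release(struct target *t)
{
    assert(t->refs > 0);
    if (--t->refs == 0) {
        free(t);
        live_targets--;
    }
}

/* Device plugged in: the long-lived SIM creates a target in a free slot. */
struct target *sim_attach_target(struct sim *s, int slot)
{
    struct target *t;

    if (slot < 0 || slot >= MAX_TARGETS || s->slots[slot] != NULL)
        return NULL;
    t = calloc(1, sizeof(*t));
    if (t == NULL)
        return NULL;
    t->refs = 1;  /* the SIM's own reference */
    s->slots[slot] = t;
    live_targets++;
    return t;
}

/* An opener (for example a mounted filesystem) takes its own reference. */
struct target *target_open(struct target *t)
{
    t->refs++;
    return t;
}

/* Device pulled: mark the target gone and drop only the SIM's reference. */
void sim_detach_target(struct sim *s, int slot)
{
    struct target *t = s->slots[slot];

    if (t == NULL)
        return;
    s->slots[slot] = NULL;
    t->gone = true;
    target_release(t);
}

/* I/O against a departed target fails cleanly instead of chasing a freed pointer. */
int target_io(const struct target *t)
{
    return t->gone ? ENXIO : 0;
}
```

    With this shape, an opener that still holds a reference after
    sim_detach_target() gets ENXIO back from target_io(), and the
    memory is freed only when that opener calls target_release().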

    2) Some filesystems, UFS in particular, assume that an I/O will never
    fail. Instead of checking the error status of the buf on completion,
    they just continue on and assume that everything is fine. If the
    VM is trying to page in a vnode, for example, it'll think that
    the operation succeeded, and then really bad things will happen. I'm
    not sure if the same problem exists in MSDOSFS because I don't have
    any DOS filesystems except on USB, and the problem with umass stands
    in the way of further testing. In lieu of fixing umass, I might have to
    create a synthetic md device to hold a msdos filesystem so that I can
    test how it behaves.
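
    The difference between ignoring and checking the completion status
    can be sketched in a few lines of C. This is a user-space model, not
    UFS or buf(9) code: struct io_req, REQ_LEN and both functions are
    made up for illustration, and the real structures carry far more
    state.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a completed I/O request; not FreeBSD's struct buf. */
#define REQ_LEN 16

struct io_req {
    int    error;          /* 0 on success, else an errno value from the driver */
    size_t resid;          /* bytes that were NOT transferred */
    char   data[REQ_LEN];  /* whatever happens to be in the buffer */
};

/* The pattern described above: assume the I/O worked and use the buffer. */
void page_in_unchecked(const struct io_req *req, char page[REQ_LEN])
{
    memcpy(page, req->data, REQ_LEN);  /* may copy stale or garbage data */
}

/* The defensive version: look at the completion status before using the data. */
int page_in_checked(const struct io_req *req, char page[REQ_LEN])
{
    if (req->error != 0)
        return req->error;  /* propagate the driver's error */
    if (req->resid != 0)
        return EIO;         /* short transfer: treat as a failure */
    memcpy(page, req->data, REQ_LEN);
    return 0;
}
```

    The checked version leaves the caller's page untouched on failure
    and hands the error upward, which is the behaviour the layers above
    would then have to cope with.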

    3) It's unknown if the VM system knows how to rationally deal with
    failed I/O or how to propagate that kind of failure to the rest of the
    kernel and/or applications. What happens if you mmap a file, and then
    the device holding the file goes away? How do you let the application
    know that its mmap is now invalid? Send it a SIGSEGV (signal 11), maybe? How should
    the vnode pager deal with failure? There are lots of interesting
    problems here.
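
    For comparison, there is one existing case where an application is
    told that part of a mapping has lost its backing: POSIX specifies
    SIGBUS for references to mapped pages that lie beyond the end of the
    underlying object. The C sketch below uses that as a stand-in for a
    vanished device by truncating a mapped temporary file. It says
    nothing about what the kernel does today when a device is pulled;
    probe_byte(), demo_lost_backing_store() and the /tmp path are
    illustrative choices, not anything from the thread.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700  /* expose POSIX interfaces under strict -std= modes */
#endif
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;
static volatile sig_atomic_t fault_sig;

/* Record which signal arrived and jump back out of the faulting access. */
static void on_fault(int sig)
{
    fault_sig = sig;
    siglongjmp(fault_env, 1);
}

/* Read one byte at p; return 0 if it worked, else the signal that was raised. */
int probe_byte(const volatile char *p)
{
    struct sigaction sa, old_bus, old_segv;

    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGBUS, &sa, &old_bus);
    sigaction(SIGSEGV, &sa, &old_segv);

    fault_sig = 0;
    if (sigsetjmp(fault_env, 1) == 0) {
        char c = *p;  /* the access that may fault */
        (void)c;
    }

    sigaction(SIGBUS, &old_bus, NULL);
    sigaction(SIGSEGV, &old_segv, NULL);
    return (int)fault_sig;
}

/*
 * Map one page of a temporary file, then truncate the file to zero length
 * to imitate the backing store going away. Returns the signal seen on the
 * next access (0 if none), or -1 if the setup itself failed.
 */
int demo_lost_backing_store(void)
{
    char path[] = "/tmp/mmap-demo-XXXXXX";
    long pagesz = sysconf(_SC_PAGESIZE);
    int fd, before, after = -1;
    char *p;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);  /* the file lives on until fd is closed */
    if (pagesz <= 0 || ftruncate(fd, pagesz) != 0) {
        close(fd);
        return -1;
    }
    p = mmap(NULL, (size_t)pagesz, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    before = probe_byte(p);  /* backing store still present */
    if (ftruncate(fd, 0) == 0)
        after = probe_byte(p);  /* backing store gone */

    munmap(p, (size_t)pagesz);
    close(fd);
    return before != 0 ? -1 : after;
}
```

    An application that wanted to survive this would need a handler
    along the lines of on_fault() around every access to the mapping,
    which is part of why the question of how to report the failure is
    not a simple one.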

    In any case, the panic posted in the grandparent message implicates CAM
    and umass, which is what I would expect. There may be more layers of
    problems underneath it.

    Scott
    _______________________________________________
    freebsd-current@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-current
    To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"

