memory corruption/panic solved ("FAILURE - ATAPI_IDENTIFY no interrupt")

From: Nate Lawson (nate_at_root.org)
Date: 07/31/04

  • Next message: Pawel Worach: "Re: gcc 3.4.2 and squid3"
    Date: Fri, 30 Jul 2004 15:48:52 -0700
    To: sos@deepcore.dk
    
    
    

    I've tracked down the source of the memory corruption in -current that
    results when booting with various CD and DVD drives (especially the ones
    that come with Thinkpads including T23, R32, T41, etc.) The panic is
    obvious when running with INVARIANTS ("memory modified after free") but
    not so obvious in other configurations. For instance, without
    INVARIANTS, part of the rt_info structure is corrupted on my wireless
    card, resulting in a panic during ifconfig on boot. This is likely the
    source of other problems, including phk's ACPI panic (again, only
    triggered when booting with the CD drive in the bay.)

    The root problem is that ata_timeout() fires and calls ata_pio_read()
    which overwrites 512 bytes random memory. There are actually two bugs
    here that overwrite memory. The code path is as follows:

    1. ata runs an IDENTIFY command on each drive. It reaches this stack:

    ata_getparam()
    ata_identify_devices()
    ata_boot_attach()

    2. ata_getparam() allocates a request and runs it:
    ata_alloc_request()
    loop on retries (2 max)
    fill out an immediate read request for 512 bytes (DEV_BSIZE)
    *** Bug 1: transfersize is 512 bytes but sizeof(struct ata_request) is
    much less (~80 bytes).
    ata_queue_request() starts the request and arms a timeout

    3. Completion for first request results in FAILURE - no interrupt.
    ata_completed() calls ata_generic_interrupt() which calls ata_pio_read()
    *** Bug 1: here's where the 512 - sizeof(struct ata_request) bytes
    following "request" are overwritten.

    . The completion code also updates request->donecount (an
    upward-counting residual) from 0 to 512 after the request has been
    completed.

    4. ata_getparam runs through the loop again and starts another identify
    *** Bug 2: donecount is now 512 so ata_pio_read() will write on
    request->data + 512 with 512 bytes. Whatever follows request->data +
    512 is overwritten.

    I've tested the attached patch and it fixes the memory overwrite (tested
    with hardware watchpoints) and obviously, the panics. It resets
    request->donecount in each loop iteration (bug 2). It also mallocs
    DEV_BSIZE bytes for the transfer (bug 1). An alternate fix for bug 1 is
    to lower the transfer size to sizeof(struct ata_request) but there may
    be some minimum transfer length requirements that force you to use
    DEV_BSIZE.

    Commit approval requested.

    -Nate

    
    

    Index: ata-all.c
    ===================================================================
    RCS file: /home/ncvs/src/sys/dev/ata/ata-all.c,v
    retrieving revision 1.215
    diff -u -r1.215 ata-all.c
    --- src/sys/dev/ata/ata-all.c 12 Jul 2004 10:50:49 -0000 1.215
    +++ src/sys/dev/ata/ata-all.c 30 Jul 2004 22:30:55 -0000
    @@ -549,7 +549,7 @@
         if (!atadev->param)
             atadev->param = malloc(sizeof(struct ata_params), M_ATA, M_NOWAIT);
         if (atadev->param) {
    - request = ata_alloc_request();
    + request = malloc(DEV_BSIZE, M_ATA, M_NOWAIT | M_ZERO);
             if (request) {
                 int retries = 2;
                 while (retries-- > 0) {
    @@ -561,11 +561,12 @@
                     request->data = (caddr_t)atadev->param;
                     request->bytecount = sizeof(struct ata_params);
                     request->transfersize = DEV_BSIZE;
    + request->donecount = 0;
                     ata_queue_request(request);
                     if (!(error = request->result))
                         break;
                 }
    - ata_free_request(request);
    + free(request, M_ATA);
             }
             if (!error && (isprint(atadev->param->model[0]) ||
                            isprint(atadev->param->model[1]))) {

    
    

    _______________________________________________
    freebsd-current@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-current
    To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"


  • Next message: Pawel Worach: "Re: gcc 3.4.2 and squid3"

    Relevant Pages