Re: file system (UFS2) consistancy after -current crash? (fwd)

From: Kirk McKusick (mckusick_at_beastie.mckusick.com)
Date: 10/04/03

  • Next message: Barney Wolff: "[security-advisories@freebsd.org: [FreeBSD-Announce] FreeBSD Security Advisory FreeBSD-SA-03:17.procfs]"
    To: Aaron Wohl <freebsd@soith.com>
    Date: Fri, 03 Oct 2003 16:29:10 -0700
    
    

            Date: Fri, 03 Oct 2003 05:03:34 -0600
            From: Aaron Wohl <freebsd@soith.com>
            To: freebsd-current@freebsd.org
            Subject: file system (UFS2) consistancy after -current crash?

            After crashes recently I've been getting soft updates inconsistencies.
            Directories in which a file has recently been renamed have neither
            the old file nor the new file. fsck -y recovers the inode and drops
            it in lost+found.

            I was under the impression that atomic rename() synced all the way
            to the disk before returning?

            Does having soft updates enabled or disabled have any bearing on this?

            The disks themselves are a RAID 5 on an Adaptec 5400S. We have had
            some problems recently with aac (the 5400S driver) related crashes
            that we have been working on with Scott Long. I was wondering if maybe
            rename is only syncing as far as the RAID controller's memory?
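
For reference, a minimal sketch of how an application can ask for a rename to be pushed to stable storage explicitly instead of relying on rename() alone. The helper name `durable_rename` is hypothetical, and the sketch assumes a Unix-like system where a directory can be opened read-only and passed to fsync(). Note that fsync() can only wait for the completion the device reports, so a controller that acknowledges writes from its own memory still defeats it.

```python
import os


def _fsync_dir(path):
    """Open a directory read-only and fsync it (Unix-like systems only)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def durable_rename(src, dst):
    """Rename src to dst, then fsync the directories that changed.

    Hypothetical helper: the rename itself is atomic with respect to
    other processes, and the fsync calls ask the kernel to write the
    updated directory entries before returning.
    """
    src_dir = os.path.dirname(os.path.abspath(src))
    dst_dir = os.path.dirname(os.path.abspath(dst))
    os.rename(src, dst)
    _fsync_dir(dst_dir)
    if src_dir != dst_dir:
        _fsync_dir(src_dir)
```
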

    The problem that we have been having with many of the RAID
    systems is that they give an I/O completion interrupt after
    they copy the change into their memory, but before the I/O
    is completed to the disk. Since the filesystem uses the I/O
    completion interrupt as an indication that the change is on
    disk, it proceeds to the next step. If the RAID ultimately
    fails to get the data to the disk, inconsistencies arise.
    This problem can arise whether or not soft updates are being
    used, but because soft updates makes individual changes over
    a longer time period (potentially up to a minute rather than
    the few milliseconds of 2-3 synchronous writes), it is more
    likely to be apparent after a crash. None of this is helped by
    a journalling filesystem, as the RAID lies about writing the
    log, so you may not have it available to do a rollback after
    a crash. As we discovered with IDE disks, disabling the "write
    cache enable" feature causes a massive performance hit, so in
    practice that does not seem like a viable strategy. What does
    work is to use tag-queueing. Unfortunately tag-queueing is
    found primarily in SCSI systems, though it is starting to
    show up in the high-end IDE disks.
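
The failure described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not how the aac driver or the filesystem is implemented: the `Controller` class, the block names `dir_a` and `dir_b`, and the order in which the cache is written out are all hypothetical. It assumes the filesystem writes the new directory entry, waits for the completion, and only then removes the old entry.

```python
class Controller:
    """Toy RAID controller. With write_back=True it reports completion
    as soon as the data is in its volatile cache."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.cache = {}  # volatile: lost on a crash
        self.disk = {"dir_a": "old_name -> inode 7", "dir_b": ""}

    def write(self, block, data):
        if self.write_back:
            self.cache[block] = data  # acknowledged, but not yet on disk
        else:
            self.disk[block] = data   # on disk before completion is reported
        return True  # stands in for the I/O completion interrupt

    def destage(self, block):
        """Move one cached block to disk, in whatever order the controller picks."""
        if block in self.cache:
            self.disk[block] = self.cache.pop(block)

    def crash(self):
        """Drop the volatile cache and return what survived on disk."""
        self.cache.clear()
        return dict(self.disk)


def rename_then_crash(write_back):
    ctrl = Controller(write_back)
    # Step 1: add the new name; the filesystem waits for completion.
    assert ctrl.write("dir_b", "new_name -> inode 7")
    # Step 2: only then remove the old name.
    assert ctrl.write("dir_a", "")
    # Assumed worst case: the controller writes out step 2 first, then crashes.
    ctrl.destage("dir_a")
    return ctrl.crash()


def names_on_disk(disk):
    """Return the non-empty directory entries that survived."""
    return [entry for entry in disk.values() if entry]
```

In this model, `names_on_disk(rename_then_crash(True))` comes back empty, matching the reported symptom of a directory holding neither the old nor the new name, while `rename_then_crash(False)` always leaves at least one name on disk.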

            Kirk McKusick
    _______________________________________________
    freebsd-current@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-current
    To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"


