RE: Yet another RAID Question (YARQ)

From: Ted Mittelstaedt (tedm_at_toybox.placo.com)
Date: 06/23/05

    To: "Sandy Rutherford" <sandy@krvarr.bc.ca>
    Date: Thu, 23 Jun 2005 02:20:09 -0700
    
    

    >-----Original Message-----
    >From: owner-freebsd-questions@freebsd.org
    >[mailto:owner-freebsd-questions@freebsd.org]On Behalf Of Sandy
    >Rutherford
    >Sent: Thursday, June 23, 2005 1:15 AM
    >To: Ted Mittelstaedt
    >Cc: freebsd-questions@freebsd.org
    >Subject: RE: Yet another RAID Question (YARQ)
    >
    >
    >>>>>> On Wed, 22 Jun 2005 23:37:20 -0700,
    >>>>>> "Ted Mittelstaedt" <tedm@toybox.placo.com> said:
    >
    > > Seagate wrote a paper on this titled:
    >
    > > "Seagate Technology Paper 338.1 Estimating Drive Reliability in
    > > Desktop Computers and Consumer Electronic Systems"
    >
    > > that explains how they define MTBF. Basically, they define MTBF as
    > > what percentage of disks will fail in the FIRST year.
    >
    >Is this in the public domain? I wouldn't mind having a look at it.
    >

    I don't think it is, but you can find ANYTHING on the Internet, no
    matter how embarrassing or private:

    http://www.digit-life.com/articles/storagereliability/
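
    For anyone who wants to check the arithmetic, here is a minimal
    Python sketch of how an MTBF figure maps to the fraction of drives
    expected to fail in the first year. It assumes a constant failure
    rate (an exponential lifetime model) and continuous 24x7 operation.
    Both are assumptions of this sketch, not necessarily what the
    Seagate paper does, and the function names and the example MTBF
    value are hypothetical.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8760 hours, assuming continuous operation


def first_year_failure_fraction(mtbf_hours, power_on_hours=HOURS_PER_YEAR):
    """Fraction of drives expected to fail within one year of use.

    Assumes a constant failure rate of 1/MTBF (exponential model).
    """
    return 1.0 - math.exp(-power_on_hours / mtbf_hours)


def mtbf_from_first_year_fraction(fraction, power_on_hours=HOURS_PER_YEAR):
    """Inverse of the above: the MTBF implied by a first-year failure fraction."""
    return -power_on_hours / math.log(1.0 - fraction)


if __name__ == "__main__":
    # Hypothetical example figure, not taken from any data sheet.
    mtbf = 1_000_000
    print(f"MTBF {mtbf} h -> {first_year_failure_fraction(mtbf):.4%} fail in year one")
    # The MTBF that would correspond to 1 drive in 160 failing in the first year.
    print(f"1 in 160 -> MTBF of about {mtbf_from_first_year_fraction(1 / 160):,.0f} h")
```

    Under these assumptions a 1,000,000 hour MTBF works out to a bit
    under 0.9% of drives failing in the first year, which is why a huge
    MTBF number says very little about how long any single drive lasts.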

    >
    > > Ain't statistics grand? You can make them say anything! For an
    > > encore Seagate went on to prove that their CEO would live 3
    > > centuries by statistical grouping. :-)
    >
    >Now don't knock statistics. The problem does not lie with statistics,
    >but with its misuse by people who do not understand what they are
    >doing. No, I am not a statistician; however, I am a mathematician.
    >

    Then I am expecting you to read Seagate's paper and, after laughing
    your ass off, post a review of it here. :-)

    > > So, in getting back to the gist of what I was saying, the issue is
    > > as you mentioned standard deviation. I think we all understand that
    > > in a disk drive assembly line that it's all robotic, and that there
    > > is an extremely high chance that disk drives that are within a few
    > > serial numbers of each other are going to have virtually identical
    > > characteristics. In fact I would say using the Seagate MTBF
    > > definition, that 1 in every 160 drives manufactured in a particular
    > > run is going to have a significant enough deviation to fail at a
    > > significantly different period of time, given identical workload.
    >
    >I am not so sure. If we were talking about can openers, I would
    >agree. However, a disk drive is basically a mechanical object which
    >performs huge numbers of mechanical actions over the course of a
    >number of years. Even extremely minute variations in the
    >physical characteristics of the materials could lead to substantive
    >variations over time. However, the operative word here is "could".
    >Real data is required. I tried to google for a relevant study, but
    >came up empty. This surprised me as it seems like the sort of thing
    >that masses of data should have been collected for.
    >

    I'm sure it has been, but it would all be useful to competitors,
    so I doubt the companies that collected the data will let it out.

    What you're asking for is nothing less than the recipe for setting
    cost levels to make a disk drive assembly line profitable - and that
    is an assembly line that, even at the best of times, operates on a
    razor-thin margin.

    Getting back to the physical characteristics, yes, I had thought of
    that too, and it is a consideration for reliability. However, the
    speeds are so high and the tolerances so tight in these things that
    any significant manufacturing deviation from the design is going to
    have the effect of seriously shortening lifetime.

    Consider also the typical automobile engine - by comparison to
    drive manufacturing the allowable variations are huge - yet for
    most cars, the engines all fail around the 200,000 mile mark.

    I think the effects of manufacturing deviations are staggered -
    during the first year they matter the most, then in successive
    years they don't matter much.
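
    One way to picture that is a failure rate that starts high and
    falls off with age. The sketch below uses a Weibull hazard function
    with a shape parameter below 1, a common textbook model for early
    "infant mortality" failures. The choice of model and every number
    in it are hypothetical illustrations, not measured drive data.

```python
def weibull_hazard(t_years, shape, scale_years):
    """Instantaneous failure rate at age t for a Weibull lifetime model.

    shape < 1 gives a failure rate that decreases with age,
    shape == 1 gives a constant rate (the exponential model).
    """
    return (shape / scale_years) * (t_years / scale_years) ** (shape - 1.0)


if __name__ == "__main__":
    # Hypothetical parameters chosen only to show the trend.
    shape, scale = 0.5, 10.0
    for age in (0.25, 0.5, 1.0, 2.0, 3.0, 5.0):
        print(f"age {age:>4} years: hazard {weibull_hazard(age, shape, scale):.4f} per year")
```

    With a shape below 1 the printed rate drops steadily as the drive
    ages, which matches the idea that marginal units are weeded out
    early and the survivors behave much more alike.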

    Ted
    _______________________________________________
    freebsd-questions@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-questions
    To unsubscribe, send any mail to "freebsd-questions-unsubscribe@freebsd.org"

