Re: Western Digital hard disks and ATA timeouts



As I'm writing this I'm trying to rescue the contents of another computer's disk.

Something about the seek heads, or something related to them, is
physically half-broken, so the disk might need up to 10 retries just to
read a sector; once read, however, it's usually no problem. I'm using
myrescue (running on 6.2, so I don't know if it's included in the
current ports, but if anyone wants to run it on FreeBSD I've done the
"gruntwork" for porting), so all the timeouts aren't a really big
issue, as it'll try to read that sector again later. But had I had
the sysctl I would've been a tad happier right now.
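
For anyone unfamiliar with the "try again later" approach, here is a rough sketch of the idea in Python. It is a simplified illustration and not myrescue's actual code; `rescue`, `read_block`, `max_passes` and the fake disk are names made up for this example.

```python
from typing import Callable, Dict, List, Tuple


def rescue(read_block: Callable[[int], bytes],
           num_blocks: int,
           max_passes: int = 10) -> Tuple[Dict[int, bytes], List[int]]:
    """Copy blocks, deferring failed ones to later passes."""
    recovered: Dict[int, bytes] = {}
    pending = list(range(num_blocks))
    for _ in range(max_passes):
        if not pending:
            break
        still_bad = []
        for block in pending:
            try:
                recovered[block] = read_block(block)
            except OSError:
                # Timeout or I/O error: move on, retry on a later pass.
                still_bad.append(block)
        pending = still_bad
    # Whatever is still pending after max_passes is reported as lost.
    return recovered, pending


def make_flaky_disk(failures_needed: Dict[int, int]) -> Callable[[int], bytes]:
    """Hypothetical disk: block N fails failures_needed[N] times, then reads fine."""
    attempts: Dict[int, int] = {}

    def read_block(block: int) -> bytes:
        attempts[block] = attempts.get(block, 0) + 1
        if attempts[block] <= failures_needed.get(block, 0):
            raise OSError("timeout")
        return bytes([block % 256]) * 512

    return read_block
```

With `make_flaky_disk({2: 3, 5: 20})` and ten passes, block 2 is recovered on the fourth pass while block 5 ends up in the lost list. The point is that one slow sector never stalls the rest of the copy.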

As for the default being a small value, I personally think it's better
to throw out some messages/errors early on, before the disk reaches a
catastrophic state. (At least on 6.2 the kernel will put out a message
for each retry without giving faults; maybe allow more retries before
throwing an error?)

By catastrophic state I'm referring to that oh-so-famous Google paper
that said that once a disk has started showing errors it doesn't
have long to live. I do trust that conclusion, as I've been
"warned" by these messages twice but ignored them until the disk
went really bad.

The main thing I'm trying to get across is that early warnings and
small problems are a helluva lot better than big disasters. Think of it
like the oil gauge on your car: it's not like you're gonna go out and
drive hundreds of km in the wilderness if you know that the car is in a
bad state. (Now if only SMART info was reliable!)

/ Jonas

2008/11/7 Peter Wemm <peter@xxxxxxxx>:
On Thu, Nov 6, 2008 at 11:17 PM, Jeremy Chadwick <koitsu@xxxxxxxxxxx> wrote:
[..]
As stated, FreeBSD's ATA command timeout is hard-set to 5 seconds, and
is not adjustable without editing the ATA code yourself and increasing
the value. The FreeNAS folks have made patches available to turn the
timeout value into a sysctl.

Soren and/or others, please increase this timeout value. Five seconds
has now been deemed too aggressive a default. And please consider
migrating the timeout value into a sysctl.

The 5 second timeout has been a problem for quite a while actually.
I've had a number of instances where I've had to increase it to 20 or
30 seconds when recovering from marginal drives. The longest
"successful" recovery attempt I've seen was 26 seconds, I believe on a
Maxtor drive a few years ago. ("successful" == the drive spent 26
seconds but eventually successfully read the sector). Even the IBM
death star drives could take much longer than 5 seconds to do a
recovery 5 years ago. 5 seconds has never been a good default.

I think the timeout should be increased to at least 30 seconds. My
Windows box has a timeout that goes for several minutes.

If there is concern about FreeBSD appearing to hang, I could imagine
that a console warning message could be printed after 5 seconds. But
just say "drive has not yet responded". But give it more time.
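
The warn-early-but-keep-waiting idea can be sketched as follows, in Python rather than kernel C, purely to illustrate the logic. This is not the ATA driver's code; `wait_for_drive`, `is_done`, the injectable `clock`/`sleep`/`warn` hooks and the 5 and 30 second defaults are assumptions made for the example.

```python
import time
from typing import Callable


def wait_for_drive(is_done: Callable[[], bool],
                   warn_after: float = 5.0,
                   give_up_after: float = 30.0,
                   poll_interval: float = 0.01,
                   clock: Callable[[], float] = time.monotonic,
                   sleep: Callable[[float], None] = time.sleep,
                   warn: Callable[[str], None] = print) -> bool:
    """Return True if the request completes, False on the hard timeout."""
    start = clock()
    warned = False
    while True:
        if is_done():
            return True
        elapsed = clock() - start
        if elapsed >= give_up_after:
            # Hard timeout: only now is the request treated as failed.
            return False
        if not warned and elapsed >= warn_after:
            # Soft threshold: tell the operator once, but keep waiting.
            warn("drive has not yet responded")
            warned = True
        sleep(poll_interval)
```

A request that completes after 26 seconds would produce a single warning at the 5 second mark and still succeed, instead of being failed outright.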

In this day and age we're generally not playing games with udma33 vs
66, notched cables, poor CRC support etc. SATA seems to have
eliminated all that. Hmm, it might make sense to increase the timeout
on SATA connections to 2 or 3 minutes by default.
--
Peter Wemm - peter@xxxxxxxx; peter@xxxxxxxxxxx; peter@xxxxxxxxxxxxx; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell
_______________________________________________
freebsd-hardware@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hardware
To unsubscribe, send any mail to "freebsd-hardware-unsubscribe@xxxxxxxxxxx"

_______________________________________________
freebsd-stable@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@xxxxxxxxxxx"


