Re: suddenly filesystem becomes read-only?
- From: "Michael Paoli" <michael1cat@xxxxxxxxx>
- Date: 7 Oct 2006 11:18:49 -0700
Troy Piggins wrote:
* Logan Shaw wrote:
Troy Piggins wrote:
* Michael Paoli wrote:
[...] may have options such as errors=remount-ro or may default to such
behavior. Some may have mount options such as errors=panic or may [...]

If this is in fact the reason the filesystem became read-only (which
it seems like it probably is), then you really need to do some
careful investigation, provided this machine is important to you.

Put it this way, my girlfriend thinks I love my linux box more
than her...

I think that's a common (natural?) occurrence between girlfriends and
LINUX ;-) ... but that's probably a topic for some other newsgroup.

At this point, there are only two possible causes:
(1) software problem, 99% chance it's a bug in the filesystem code
(2) hardware problem, which means either defective memory (unlikely)
or dying hard disk.

Yep, I'm thinking (2) also.

Well, I'd also lump power disruptions into (2), or in some cases (1),
depending upon their cause, and they could be completely external
to the system itself that's had the filesystem issue - nevertheless
such could introduce logical data corruption and/or hardware damage
to the disk, or otherwise have negative impacts upon the computer
system.
If I had to bet money, I'd say this was most likely due to a defective
hard disk. So, I would check your disk with whatever tool Linux provides
for checking for bad sectors. You could even just do something like a
"dd if=/dev/hda1 of=/dev/null bs=1024k" just to be sure you can read
all the sectors. Checking that you can read and write them both would
be a better test, though.
I'll do some checks on the weekend. Thanks for the pointers.
Yes, that's one of the first things I'd also do if I suspected there
might be disk hardware problems - read the entire device end-to-end,
and see if that is successful or not. Note also that more modern
and/or intelligent hard drives (e.g. SCSI, most non-ancient IDE/ATA,
etc.) are relatively intelligent about automagically "fixing"
minor hard drive problems. Within my experience, on SCSI this is
typically much more graceful than with IDE/ATA, but others may have
had different experiences (thus far I've not dealt with a particularly
large number of drives that automagically recovered themselves, so
I'm working from a small statistical sample).

With SCSI, there's a "grown defects list" (or whatever its precise
name is). When bad/suspect sectors are found, they're added to this
list. If the drive is still able to read the data (if it's read before
there is some attempt to write it), it will rewrite it elsewhere, and
remap so the alternate sector is used. If it's simply written to, it
likewise remaps it, writes to the alternate sector, and uses that
sector henceforward. Things only go rather to quite poorly with this
scheme when either the operating system still needs to read the data
and the SCSI drive can't successfully read it, or the "grown defects
list" table overflows and the SCSI drive can no longer remap bad
sectors. If you or your monitoring software can monitor the "grown
defects list", watch for growth there: if it's growing fast, or the
table is approaching full, those are strong indicators of a disk that
is quite likely to fail unrecoverably in the near future.

With IDE/ATA I've seen similar, but less graceful behavior. With such
drives, it seems (I've not verified this at all ... just my
guesstimate from behaviors I've seen) the drives aren't as "proactive"
about remapping. It seems they only get around to remapping after a
sector has gotten to the point where it can't be successfully read.
SCSI, on the other hand, seems a bit more intelligent about this, and
is often capable of detecting that sectors are becoming "difficult" to
read (perhaps close to tolerance limits, or experiencing some read
errors but succeeding with repeated read retries), and often
successfully remaps them (with no visible sign that any problems
occurred, other than the growth of the defects list, and perhaps a
trace of extra latency in reads on some occasions).

Anyway, even with the "smart", but not *as* "smart" IDE/ATA drives,
overwriting the sector of the device that's having the problem will
often cause the problem to automagically "disappear", as it gets
remapped upon the overwrite (e.g. my personal laptop has given this
precise behavior exactly twice thus far in the over 3 years that I've
had this laptop). Note however, that for many filesystem types,
overwriting the file that contains the bad sector may not attempt to
overwrite the bad sector itself - e.g. journaling filesystems will
typically write the data elsewhere upon "overwrite" (so that an
incomplete action - such as one disrupted by loss of power or system
lock-up - can be "rolled back" (or forward) to a consistent
filesystem state).
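
For what it's worth, the end-to-end read check mentioned at the top of
this reply could be sketched roughly as below. This is only a minimal
illustration, not a replacement for dd or a proper bad-block tool: the
function name scan_readable and the 1 MiB chunk size are my own
assumptions, and it only notes which chunks failed to read rather than
pinpointing individual sectors.

```python
import os


def scan_readable(path, chunk_size=1024 * 1024):
    """Read `path` from start to end, chunk by chunk.

    Returns (bytes_read, bad_offsets), where bad_offsets lists the
    starting offset of every chunk whose read raised an I/O error.
    The name and return shape are illustrative assumptions.
    """
    bad_offsets = []
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size of a regular file or block device.
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            want = min(chunk_size, size - offset)
            try:
                data = os.pread(fd, want, offset)
            except OSError:
                # Unreadable chunk: note where it starts and keep going.
                bad_offsets.append(offset)
                offset += want
                continue
            if not data:
                break  # unexpected end of data
            total += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return total, bad_offsets
```

Something like scan_readable("/dev/hda1") would need to be run as root,
and a non-empty second element in the result would be a reason to look
much more closely at the drive.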
As was mentioned (or at least hinted at) earlier, if you're able to
do repeated overwrites of various patterns, that's typically best at
testing/exercising a hard drive (particularly with lots of
random seeks included - I've had drives that read (and wrote)
perfectly fine end-to-end, but failed miserably under random seek
conditions) ... but "most of the time" (at least more often than
not), reading end-to-end (even in purely sequential manner) will
pick up problems a drive is having. Also, due to all the
automagic remapping stuff, overwrite tests can quickly and
effectively "hide" a problem (or make it go away, when successfully
remapped), and can make it less clear that there at least *was* a
problem ... hence I generally recommend at least doing full reads
before trying overwrites (at least if one wants to check/confirm if
the drive is or has been having a problem).
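
As a rough illustration of the random-seek part, here's a read-only
sketch (so it won't trigger remapping the way an overwrite test
could). The function name random_read_test, the 4 KiB block size, and
the default of 1000 reads are arbitrary assumptions of mine, not
anything standard.

```python
import os
import random


def random_read_test(path, reads=1000, block_size=4096, seed=None):
    """Read `reads` blocks at random block-aligned offsets in `path`.

    Returns the list of offsets where the read raised an I/O error or
    came back short. Read-only, so nothing on the device is modified.
    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    failures = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        blocks = size // block_size
        if blocks == 0:
            return failures  # too small to hold even one full block
        for _ in range(reads):
            # Jump to a random whole block, forcing a seek on real disks.
            offset = rng.randrange(blocks) * block_size
            try:
                data = os.pread(fd, block_size, offset)
            except OSError:
                failures.append(offset)
                continue
            if len(data) != block_size:
                failures.append(offset)  # short read, treated as suspect
    finally:
        os.close(fd)
    return failures
```

Running the sequential read first and this afterwards keeps with the
reads-before-overwrites order suggested above.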
Also, not sure about the latest protocols, standards, and tools, but
as far as I'm aware, the "grown defects list" can be inspected (e.g.
via software and SCSI protocols) on SCSI disks, but I don't think
such capability exists for ATA/IDE drives (but perhaps that's
changed?). Precise answers on that may also vary depending on OS
flavor and available software. You mentioned Ubuntu (which is Debian
based). Debian has tools for getting detailed information from SCSI
devices - including the "grown defects" list - so I'd think it
probable that Ubuntu includes or makes available the same or a similar
tool. I'm not as sure about ATA/IDE, but perhaps you or someone else
will provide us with more information (and any applicable corrections)
regarding such.
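
One possible way to watch the grown defects count, sketched under some
assumptions: that the smartmontools package is installed, and that
"smartctl -a" on a SCSI disk prints a line of the form "Elements in
grown defect list: N" (that's the wording as I recall it, so check it
against what your version actually prints). The function names are
made up for this example.

```python
import re
import subprocess

# Assumed wording of the smartctl line for SCSI disks; verify locally.
GROWN_RE = re.compile(r"Elements in grown defect list:\s*(\d+)")


def parse_grown_defects(smartctl_output):
    """Return the grown defect count found in the text, or None."""
    match = GROWN_RE.search(smartctl_output)
    return int(match.group(1)) if match else None


def grown_defects(device):
    """Run `smartctl -a DEVICE` and extract the grown defect count.

    smartctl's exit status is a bit mask that can be non-zero even
    when output was produced, so the status is deliberately ignored.
    """
    result = subprocess.run(
        ["smartctl", "-a", device],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_grown_defects(result.stdout)
```

Logging the value from something like grown_defects("/dev/sda") once a
day and comparing it with the previous day's value would be one way to
spot the fast growth described earlier.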