Re: A little story of failed raid5 (3ware 8000 series)




----- "David Schwartz" <davids@xxxxxxxxxxxxx> wrote:

It is supposed to be
for detecting data corruption, so if the card isn't using the
checksum, it's kind of useless.

You are confused. Checking for data corruption is done by checking if
the *DATA* is corrupt. This does not require looking at the RAID5
checksum since the data has its own data checksum.

No, not really. You are just referring to parity as checksums. They are different.

Many RAID systems have checksums in addition to parity. For example, NetApp ZCS disks.
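
To make the distinction concrete, here is a minimal Python sketch. It is not how any particular controller or filer does it: the block contents are made up, and CRC32 is only a stand-in for whatever checksum a real system stores. It shows that XOR parity can rebuild a block that is known to be missing, while a per-block checksum is what tells you which readable block has gone bad.

```python
import zlib
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks in one stripe, plus their parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Per-block checksums, stored separately from parity (CRC32 is an assumption).
checksums = [zlib.crc32(block) for block in data]

# Case 1: block 1 is lost entirely. Parity rebuilds it from the survivors.
rebuilt = xor_blocks([data[0], data[2], parity])

# Case 2: block 1 is silently corrupted but still readable.
corrupted = [data[0], b"BxBB", data[2]]

# Parity only says the stripe is inconsistent, not which block is wrong.
parity_mismatch = xor_blocks(corrupted) != parity

# The checksums point at the specific bad block.
bad_blocks = [i for i, block in enumerate(corrupted)
              if zlib.crc32(block) != checksums[i]]
```

In case 2 the parity comparison alone cannot say whether a data block or the parity block itself is the wrong one, which is why the two mechanisms are not interchangeable.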

...

However, in this particular case, validating checksums would
have been unhelpful, since the disk was unreadable. diskcheckd
would have detected this issue. It would probably have prevented
the problem, if it had been running previously.

No, it would have saved him. The problem was he lost a drive, and
checksums *ON* *OTHER* *DRIVES* were unreadable. Quite possibly they
had been unreadable for months, but were never checked, since they are
only *needed* to reconstruct the data.

Which is what I said: the data on the other disks is unreadable. It doesn't really matter whether parity or data was on those sectors. Yes, diskcheckd would only read data sectors.

ZFS is also a good option. It has file-level checksumming.
ZFS never trusts the disks, and is super paranoid. And ZFS can
do background scrubbing too. I can't wait for ZFS in FreeBSD 7,
because ZFS in software is going to be 10x better than anything 3ware
has.

That would not have helped him one bit. When the drive failed, the
RAID 5 checksums on the other drives still would not have been
scrubbed. The RAID 5 checksum (technically an XOR) is only needed to
recover the RAID 5 array if a drive (or sector) fails.

OK, you should probably not refer to RAID5 parity as "checksums". They are different. Some RAID systems have both, and some do not.

ZFS checksums the file-level data, which is independent of any RAID5 parity. And yes, a media scan would have helped.
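
As a rough illustration of why a media scan matters, here is a toy Python model. The Disk class, media_scan and rebuild are all hypothetical names for this sketch, not the 3ware firmware or any real tool. Sectors are plain integers and the third disk holds the XOR parity. A scan that reads every sector finds the latent bad sector while the array is still healthy; without it, the bad sector only shows up when a rebuild needs it.

```python
class Disk:
    """Toy disk: a list of sectors plus a set of sector numbers that fail on read."""
    def __init__(self, sectors, bad=()):
        self.sectors = list(sectors)
        self.bad = set(bad)

    def read(self, n):
        if n in self.bad:
            raise OSError(f"unreadable sector {n}")
        return self.sectors[n]

def media_scan(disks):
    """Read every sector of every disk; return (disk index, sector) pairs that fail."""
    failures = []
    for d, disk in enumerate(disks):
        for n in range(len(disk.sectors)):
            try:
                disk.read(n)
            except OSError:
                failures.append((d, n))
    return failures

def rebuild(survivors, sector_count):
    """Rebuild a failed disk by XOR-ing the same sector from every survivor."""
    out = []
    for n in range(sector_count):
        value = 0
        for disk in survivors:
            value ^= disk.read(n)  # raises if a survivor has a latent bad sector
        out.append(value)
    return out

# Two data disks and one parity disk; sector 1 of the parity disk is
# silently unreadable and has never been touched.
d0 = Disk([1, 2])
d1 = Disk([4, 8])
d2 = Disk([1 ^ 4, 2 ^ 8], bad={1})

# A scan while all disks are present reports the problem early.
found = media_scan([d0, d1, d2])

# If d0 dies before the bad sector is dealt with, the rebuild fails.
try:
    rebuild([d1, d2], 2)
    rebuild_ok = True
except OSError:
    rebuild_ok = False
```

The point of the sketch is only the ordering: the scan has to run before the first drive is lost, because after that the unreadable sector can no longer be reconstructed from the rest of the stripe.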


Tom
_______________________________________________
freebsd-stable@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@xxxxxxxxxxx"


