Re: nfs-server silent data corruption



Kris Kennaway <kris@xxxxxxxxxxx> writes:

On Mon, Apr 21, 2008 at 01:02:33AM +0200, Arno J. Klaassen wrote:

I didn't stress-test this MB for a while, but the last time I did was
with 7-PRERELEASE/RC?/CANTremember-exactly-but-close-to-release
and all worked great

I did add 2G ECC to the 2nd CPU since, though I doubt that interferes
with NFS.

Uh, you're getting server-side data corruption, it could definitely be
because of the memory you added.

yop, though I'm still not convinced the memory is bad (the very same
Kingston ECC as the 2*1G in use for about half a year already) :

I added it directly to the 2nd CPU (diagram on page 9 of
http://www.tyan.com/manuals/m_s2895_101.pdf) and the problem
seems to be the interaction between nfe0 and powerd .... :

- if I stop powerd, the problems go away
- if I leave powerd running but turn off txcsum and tso4 on the interface,
the problem is a lot harder to reproduce (in case this gives
a hint to anyone)
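
For anyone who wants to check for this kind of silent corruption themselves, here is a minimal sketch in Python. It is not the test used above, only an illustration: it compares a known-good local file with the copy read back through the NFS mount and reports the byte offset of the first difference. The function name first_mismatch and the 64 KiB block size are assumptions made for this example, and the two paths are whatever you pass on the command line.

```python
import sys

CHUNK = 64 * 1024  # block size for the comparison (arbitrary choice)


def first_mismatch(path_a, path_b, chunk=CHUNK):
    """Return the byte offset of the first difference between two files,
    or None if their contents are identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk)
            b = fb.read(chunk)
            if a != b:
                # Find the exact differing byte inside this block.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No differing byte in the common part: one file is shorter.
                return offset + min(len(a), len(b))
            if not a:
                # Both files ended at the same point with no difference.
                return None
            offset += len(a)


if __name__ == "__main__" and len(sys.argv) == 3:
    where = first_mismatch(sys.argv[1], sys.argv[2])
    if where is None:
        print("files are identical")
    else:
        print("first difference at byte offset %d" % where)
```

Running it repeatedly with powerd on and off, and with the offload options toggled, should show whether the corruption tracks those settings. Note that the client may serve the read from its own cache, so the copy should be read from a different client or after a remount.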

Device is :

nfe0@pci0:0:10:0: class=0x068000 card=0x289510f1 chip=0x005710de rev=0xa3 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'nForce4 Ultra NVidia Network Bus Enumerator'
    class      = bridge
    cap 01[44] = powerspec 2 supports D0 D1 D2 D3 current D0

(this is with the default BIOS setting "LAN Bridge Enabled"; disabling
that setting makes pciconf say "class = network" but does not influence
my problem)

I will now rerun my tests with all 4G populated on CPU1 only and
report whether that matters.

Best, Arno

_______________________________________________
freebsd-net@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscribe@xxxxxxxxxxx"
