Re: [ATA] and re(4) stability issues



On Thu, Dec 11, 2008 at 06:00:56PM +0900, Pyun YongHyeon wrote:
On Thu, Dec 11, 2008 at 09:10:45AM +0100, Victor Balada Diaz wrote:
> On Thu, Dec 11, 2008 at 08:57:07AM +0100, Victor Balada Diaz wrote:
> > On Wed, Dec 10, 2008 at 09:07:19PM +0900, Pyun YongHyeon wrote:
> > > On Wed, Dec 10, 2008 at 12:32:25PM +0100, Victor Balada Diaz wrote:
> > > > Also, I didn't see any problem with interfaces going up and down,
> > > > but that usually happens after some hours of uptime, so I'll let
> > > > you know if the error happens again.
> > > >
> >
> > After writing to the HD with dd for a few hours and running
> > stress -i 10 -d 10, the machine lost connectivity. I waited until
> > today to be sure whether the machine hung, panicked or just lost
> > network connectivity. I don't have local or serial access, so this
> > is the only way I could do it. During the night the logs showed
> > several messages like these:
> >
> >
> > Dec 10 00:33:49 yac kernel: re0: watchdog timeout
> > Dec 10 00:33:49 yac kernel: re0: link state changed to DOWN
> > Dec 10 00:33:52 yac kernel: re0: link state changed to UP
> >
> > The interface never recovered and I wasn't able to ping the machine
> > until I rebooted. Nagios was checking the whole time and saw no
> > recovery.
> >
> > The netstat -i output in the daily scripts shows just one Oerrs. I'm
> > used to seeing a lot of them, but it seems this time the card didn't
> > recover from the only one. I also want to say that this is not a
> > regression, as it happened before with 7.1-BETA2 code.
> >
> > Is there anything more I can try?
>
> Sorry, it's too early in the morning and I thought today was the 10th
> instead of the 11th. I don't even know what day it is.
>
> Looking at today's log I see no link state changed messages,
> but I do see these other messages, which started at more or
> less the same time I lost connectivity to the server:
>
> Dec 10 18:20:32 yac kernel: re0: link state changed to DOWN
> Dec 10 18:20:32 yac kernel: re0: PHY read failed
>

I've reverted r185756, which caused GMII access issues on some
controllers. If you are brave enough to try beta code, you can
get the latest re(4) from the following URLs. Note that I don't have
PCIe-based RealTek controllers, so the code was not tested at all.

http://people.freebsd.org/~yongari/re/if_re.c
http://people.freebsd.org/~yongari/re/if_rlreg.h

I've recompiled the kernel with the first file in sys/dev/re/
and the second one in sys/pci/. I'm still testing with MSI enabled.

So far I've rebooted using nextboot(8) (so I could boot again in case
I lost the network card) and the card seems to work, but I'll continue
stress testing the machine with stress + dd + iperf and see if I can
take it down. I'll let you know how it goes.
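
For anyone who wants to check their own logs for the same symptoms,
here is a minimal sketch of a scanner for the re0 kernel messages
quoted above. It is only an illustration: the function name scan() and
the SAMPLE list are made up for this example, the sample lines are
copied from the log excerpts in this thread, and the regular expression
assumes the usual syslog layout "Mon DD HH:MM:SS host kernel: ifname: msg".

```python
import re
from collections import Counter

# Assumed syslog layout, e.g.
# "Dec 10 00:33:49 yac kernel: re0: watchdog timeout"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s+\S+\s+kernel:\s+"
    r"(?P<ifname>\w+):\s+(?P<msg>.*)$"
)


def scan(lines, ifname="re0"):
    """Return (message counts, last link state seen) for one interface."""
    counts = Counter()
    last_link = None
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or m.group("ifname") != ifname:
            continue  # not a kernel message for this interface
        msg = m.group("msg")
        counts[msg] += 1
        if msg.startswith("link state changed to "):
            # Remember the most recent state (UP or DOWN).
            last_link = msg.rsplit(" ", 1)[1]
    return counts, last_link


# Sample lines copied from the log excerpts quoted in this thread.
SAMPLE = """\
Dec 10 00:33:49 yac kernel: re0: watchdog timeout
Dec 10 00:33:49 yac kernel: re0: link state changed to DOWN
Dec 10 00:33:52 yac kernel: re0: link state changed to UP
Dec 10 18:20:32 yac kernel: re0: link state changed to DOWN
Dec 10 18:20:32 yac kernel: re0: PHY read failed
""".splitlines()

if __name__ == "__main__":
    counts, last_link = scan(SAMPLE)
    for msg, n in counts.most_common():
        print(n, msg)
    print("last link state:", last_link)
```

To run it against a real log, pass an open file instead of SAMPLE, for
example scan(open(path)) with path pointing at wherever your syslog
writes kernel messages. A last link state of DOWN with no later UP
matches the hang described above.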

Regards.

--
The surest proof that intelligent life exists on other planets
is that they have not tried to contact us.
_______________________________________________
freebsd-stable@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@xxxxxxxxxxx"


