Re: [ATA] and re(4) stability issues
- From: Victor Balada Diaz <victor@xxxxxxxxx>
- Date: Thu, 11 Dec 2008 10:50:21 +0100
On Thu, Dec 11, 2008 at 06:00:56PM +0900, Pyun YongHyeon wrote:
On Thu, Dec 11, 2008 at 09:10:45AM +0100, Victor Balada Diaz wrote:
> On Thu, Dec 11, 2008 at 08:57:07AM +0100, Victor Balada Diaz wrote:
> > On Wed, Dec 10, 2008 at 09:07:19PM +0900, Pyun YongHyeon wrote:
> > > On Wed, Dec 10, 2008 at 12:32:25PM +0100, Victor Balada Diaz wrote:
> > > > Also, I didn't see any problems with interfaces going up and down,
> > > > but that usually happens after some hours of uptime, so I'll let
> > > > you know if the error happens again.
> > > >
> >
> > After writing to the HD with dd for a few hours and running
> > stress -i 10 -d 10, the machine lost connectivity. I waited until
> > today to find out whether the machine had hung, panicked, or just
> > lost network connectivity. I don't have local or serial access, so
> > this is the only way I could do it. During the night the logs show
> > several messages like these:
> >
> >
> > Dec 10 00:33:49 yac kernel: re0: watchdog timeout
> > Dec 10 00:33:49 yac kernel: re0: link state changed to DOWN
> > Dec 10 00:33:52 yac kernel: re0: link state changed to UP
> >
> > The interface never recovered and I wasn't able to ping the machine
> > until I rebooted. Nagios was checking the whole time and never saw
> > it recover.
> >
> > The netstat -i output in the daily scripts shows just one Oerrs. I'm
> > used to seeing a lot of them, but it seems that this time the card
> > didn't recover from the only one. I also want to say that this is not
> > a regression, as it happened before with 7.1-BETA2 code.
> >
> > Is there anything else I can try?
>
> Sorry, it's too early in the morning and I thought today was the 10th
> instead of the 11th. I don't even know what day it is.
>
> Looking at today's log I see no link state changed messages,
> but I see these other messages, which started more or less
> at the same time I lost connectivity to the server:
>
> Dec 10 18:20:32 yac kernel: re0: link state changed to DOWN
> Dec 10 18:20:32 yac kernel: re0: PHY read failed
>
I've reverted r185756, which caused GMII access issues on some
controllers. If you are brave enough to try beta code, you can
get the latest re(4) from the following URLs. Note that I don't have
PCIe-based RealTek controllers, so the code has not been tested at all.
http://people.freebsd.org/~yongari/re/if_re.c
http://people.freebsd.org/~yongari/re/if_rlreg.h

I've recompiled the kernel with the first file in sys/dev/re/
and the second one in sys/pci/. I'm still testing with MSI enabled.
So far I've tried rebooting with nextboot(8) (so that if I lost the
network card I could still boot again) and the card seems to work,
but I'll continue stress testing the machine with stress + dd +
iperf to see if I can take it down. I'll let you know how it goes.
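
As a rough illustration only, the re0 messages quoted above could be pulled out of a log during a long stress run with a short script along these lines. This is a sketch, not something used in this thread: the names LINE_RE, SAMPLE, parse_re_events and summarize are hypothetical, and it assumes the syslog line format shown in the quoted logs.

```python
import re
from collections import Counter

# Assumed line format, taken from the quoted logs, e.g.:
#   Dec 10 00:33:49 yac kernel: re0: watchdog timeout
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+kernel:\s+"
    r"(?P<ifname>re\d+):\s+(?P<msg>.+)$"
)

# The log lines quoted earlier in this message.
SAMPLE = [
    "Dec 10 00:33:49 yac kernel: re0: watchdog timeout",
    "Dec 10 00:33:49 yac kernel: re0: link state changed to DOWN",
    "Dec 10 00:33:52 yac kernel: re0: link state changed to UP",
    "Dec 10 18:20:32 yac kernel: re0: link state changed to DOWN",
    "Dec 10 18:20:32 yac kernel: re0: PHY read failed",
]


def parse_re_events(lines):
    """Return (timestamp, interface, message) tuples for re(4) kernel lines."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            events.append(
                (match.group("stamp"), match.group("ifname"), match.group("msg"))
            )
    return events


def summarize(events):
    """Count how often each distinct driver message appears."""
    return Counter(msg for _, _, msg in events)


if __name__ == "__main__":
    events = parse_re_events(SAMPLE)
    for stamp, ifname, msg in events:
        print(f"{stamp}  {ifname}  {msg}")
    for msg, count in summarize(events).most_common():
        print(f"{count:3d}  {msg}")
```

To run it against a real log, the lines of /var/log/messages (or whichever file the kernel messages go to on the machine) could be passed to parse_re_events in place of SAMPLE.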
Regards.
--
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.
_______________________________________________
freebsd-stable@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@xxxxxxxxxxx"