Re: network failures

From: J.Salfer (JohnS_at_VoyageurSoftware.com)
Date: 02/21/05


Date: Mon, 21 Feb 2005 10:05:55 -0600


"Bill Vermillion" <bv@wjv.com> wrote in message news:IC70Jw.2001@wjv.com...
> In article <111eu9pat361993@corp.supernews.com>,
> news.nuvox.net <fabiog@venmar.com> wrote:
>>
>>"J.Salfer" <JohnS@VoyageurSoftware.com> wrote in message
>>news:421651af$1_1@newspeer2.tds.net...
>>> "Ronald J Marchand" <ron@rojomar.com> wrote in message
>>> news:f2e1c$42164df8$42a6716f$28713@msgid.meganewsservers.com...
>>>> "J.Salfer" <JohnS@VoyageurSoftware.com> wrote in message
>>>> news:42164622$1_1@newspeer2.tds.net...
>>>>> Hello,
>>>>>
>>>>> Recently our SCO OSR505 box has started having network problems where
>>>>> we lose all connections and cannot reconnect. The system has to be
>>>>> rebooted to get things working again.
>>>>>
>>>>> When we view netstat -m, we see huge failure counts in the class 6 and,
>>>>> in this case, class 7 lines. We've increased NSTRPAGES from 1400 to
>>>>> 3500. That seemed to reduce the frequency, but the problem still comes
>>>>> up.
>>>>>
>>>>> We could keep increasing the value, but I'm wondering what all of a
>>>>> sudden started causing the problem, as the system had been stable for 3
>>>>> years prior to this coming up.
>>>>>
>>>>> Comments would be greatly appreciated.
>>>>>
>>>>> Thanks,
>>>>> John.
>>>>
>>>> Had a similar situation with 5.0.6 and a 3Com card. Worked well when
>>>> installed. Over time the number of connections and the traffic on the
>>>> system increased. Replaced the card with an Intel Pro 1000 and all has
>>>> been well since.
>>>>
>>>> Ron
>>>>
>>>>
>>>
>>> Interesting. The machine has an Intel Pro100 on-board NIC that we
>>> disabled to install a 3Com early in our troubleshooting. Our switch was
>>> reporting that we were sending out a low number of bad packets, and we
>>> thought maybe the problems were related. The hang problem continued
>>> though, and it led us to find the failures and that increasing
>>> NSTRPAGES helped some. Makes me wonder if I should switch back to the
>>> on-board card.
>>>
>>> We increased NSTRPAGES to 8000 for now to see if it'll resolve the
>>> issue.
>
>>> I think I should still find something to monitor incoming
>>> packets. I'm thinking that maybe something on the network has
>>> started hitting the system, since the problem came up out of
>>> the blue.
>
>>> Anyone know of a good program? So far I've come across nmap,
>>> tcpdump, snoop and etherfind, but I'm not familiar with any of
>>> them so I don't know if one is better than the other.
>
>>> John.
>
>>May or may not be an issue, but in the past, letting both the
>>card and the switch auto-negotiate speed and duplex has caused
>>problems. Hard coding both (or just one if you can't do both) to
>>the appropriate settings (e.g. 100 Mb/s full-duplex) might help.
>
> Actually you have to let both auto-negotiate. If auto-negotiation
> fails you will find it defaulting to a mode which is usually not
> good.
>
> For a very good discussion on how things can go wrong and examples
> of different settings, see this link.
>
> http://www.cisco.com/warp/public/473/46.html
>
> It also points out that major problems occur when some vendors
> decided to add enhancements that aren't correctly specified.
>
> It's a longish document - but it is very good.
>
> Bill
>
> --
> Bill Vermillion - bv @ wjv . com

I didn't think auto-negotiation would cause something like this. I have
locked speed and duplex down in the past, but I'm not seeing the option to
do that on this card anyway.

Below is the line for the current card in the system, but the same problem
was happening with the on-board Intel Pro 100 so I'm not convinced it's a
hardware/driver issue.

HW 3Com EtherLink 10/100 PCI (3C905B) - PCI Bus 1,Device 10,Function

I looked around over the weekend, but didn't find a good monitor for
OpenServer to collect packets. I could use a Windows app instead, I suppose.
It won't be too hard to move the system from the switch to a hub so I can
capture all traffic to it.
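
If I go the capture route, a rough sketch like the one below might be enough
to see which hosts are sending the most traffic. It assumes the capture tool
can save in the classic libpcap file format with plain Ethernet framing.
count_ipv4_sources is just a name made up for illustration, and it ignores
VLAN-tagged frames and anything that isn't IPv4.

```python
import struct
from collections import Counter


def count_ipv4_sources(pcap_bytes):
    """Count packets per IPv4 source address in a classic libpcap capture."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written on a big-endian machine
    else:
        raise ValueError("not a classic pcap file")

    # The global header is 24 bytes; its last 4 bytes hold the link type.
    (linktype,) = struct.unpack(endian + "I", pcap_bytes[20:24])
    if linktype != 1:  # 1 = Ethernet
        raise ValueError("only Ethernet captures are handled")

    counts = Counter()
    offset = 24
    while offset + 16 <= len(pcap_bytes):
        # Per-packet record header: ts_sec, ts_usec, incl_len, orig_len.
        _, _, incl_len, _ = struct.unpack(
            endian + "IIII", pcap_bytes[offset:offset + 16]
        )
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len

        # Ethernet header is 14 bytes; EtherType 0x0800 means IPv4.
        # The IPv4 source address sits 12 bytes into the IP header.
        if len(frame) >= 34 and frame[12:14] == b"\x08\x00":
            src = ".".join(str(b) for b in frame[26:30])
            counts[src] += 1
    return counts
```

Reading the saved file in binary mode, passing the bytes to
count_ipv4_sources, and printing counts.most_common(10) should show the
busiest senders, which is where I'd start looking.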

Thanks for the input.
 John.