Some services on my edge box slow to reply
From: clvrmnky (clvrmnky-uunet_at_coldmail.com.invalid)
Date: 09/21/04
Date: Tue, 21 Sep 2004 13:16:02 -0400
OpenBSD 3.1, i386
Somehow between last night and this morning, nodes on the inside of my
firewall/router (referred to as "martini" from now on) are very slow
accessing some services.
I noticed the problem when I tried to ssh to martini this morning, and
the login took a very long time (on the order of 30 seconds). I use
RSA-based authentication. I recalled that there was a FAQ item about
this very problem, and determined that setting "UseDNS no" solved /this/
particular problem. I have an entry in my /etc/hosts for the single
external node I often connect with ssh from (again, with RSA auth) just
to shut SSH up. I've never had a problem with ssh from internal nodes
in this manner.
However, I also noticed that my IMAP connection to martini seemed to be
failing, as well as *some* web sites configured on martini. I run
Apache on this box, as well (the wisdom of running public services on an
edge box aside -- this problem manifested itself within the last 12
hours), with a few virtual sites all resolving to the same IP address.
For some reason, two of the sites take forever to load up via a browser
or curl (they will load eventually), using any of the internal or
external hostnames /or/ the IP address. The simple one-page virtual
sites I have parked come up fine.
Accessing these same sites via lynx on the server box itself is fine.
During my diagnostics, I noticed that "arp -a" also took forever to
return; "arp -na" returns promptly. So, it appears that reverse lookups
are failing in some manner. The problem is that I don't know how or
why. "netstat -r" is also really slow now -- much slower than usual.
If I run nslookup against an internal DHCP-supplied IP address, it takes
a long time to time out (1m12.092s wall-clock time!). I seem to recall
that it timed out much faster in the past. If I force nslookup to use
my ISP's nameserver, it takes 0m15.217s.
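
The slow "arp -a" versus fast "arp -na" comparison above can be
reproduced for a single address with a small timing script. This is only
a rough sketch: it goes through the system resolver (so /etc/hosts and
the configured nameservers both come into play), and the function name
and the injectable "resolver" parameter are made up for illustration.

```python
import ipaddress
import socket
import sys
import time


def time_reverse_lookup(addr, resolver=socket.gethostbyaddr):
    """Return (hostname or None, elapsed seconds) for a reverse lookup."""
    start = time.monotonic()
    try:
        hostname = resolver(addr)[0]
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses:
        # no PTR record was found, or the resolver gave up.
        hostname = None
    return hostname, time.monotonic() - start


if __name__ == "__main__":
    # Time each IP address given on the command line.
    for arg in sys.argv[1:]:
        try:
            ipaddress.ip_address(arg)
        except ValueError:
            continue  # skip anything that is not an IP literal
        name, secs = time_reverse_lookup(arg)
        print(f"{arg}: {name or 'no answer'} ({secs:.3f}s)")
```

Running it against one address from /etc/hosts and one DHCP-supplied
address should show whether only the unresolvable addresses are slow.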
My connection to the world is a PPPoE/ADSL connection with a static IP
address. ppp has been coming up w/o error, and I have it set to abort
if I don't get the exact IP address I want on dialup.
Some additional details:
- martini runs a caching-only nameserver, and all nodes within the
network use this as the primary nameserver. All nodes have a secondary
nameserver provided by my ISP.
- Problem remains regardless of whether pf is enabled or not.
- Resetting the Ethernet switch on the inside NIC did not help.
- From a look at the number of semaphore and work files in /tmp, httpd
is waiting for something before it returns content. I'm guessing that
something related to PHP or MySQL is forcing a reverse lookup, causing
the server to stall. Disabling mod_gzip, PHP and database access does
not help, however.
- I have tweaked some sysctls (related to TCP keepalives) a long time
ago, though this server has been running happily with these settings for
years.
- Hard reboot in desperation solved nothing.
- I did make a single change to my /etc/hosts yesterday when I set up an
AirPort Express device as a bridging device, and gave it a static, local
IP address that belonged to an old host that has not been around for
years. I run DHCP for internal nodes, and reserved 10.0.0.1-10.0.0.10
and 10.0.0.254 for internal use. dhcpd is allowed to give out
10.0.0.11-10.0.0.50. The rest of the 255.255.255.0 network is unused. I
only have that single netmask (i.e., no subnetting). For fun, I removed
the AirPort Express from the network. Only a single node can actually
talk to it via Wi-Fi, and it defaults to Ethernet anyway.
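
The address plan in the last item can be written down as a small check,
for example to confirm that a new static entry in /etc/hosts falls in
the reserved range rather than the dhcpd pool. This is a sketch only:
the 10.0.0.0/24 network is inferred from the addresses and the
255.255.255.0 netmask, and the names are invented for illustration.

```python
import ipaddress

# Assumption: the inside network is 10.0.0.0/24 (netmask 255.255.255.0).
NETWORK = ipaddress.ip_network("10.0.0.0/24")

# Ranges as described in the post (inclusive on both ends).
STATIC_RANGES = [
    (ipaddress.ip_address("10.0.0.1"), ipaddress.ip_address("10.0.0.10")),
    (ipaddress.ip_address("10.0.0.254"), ipaddress.ip_address("10.0.0.254")),
]
DHCP_POOL = (ipaddress.ip_address("10.0.0.11"), ipaddress.ip_address("10.0.0.50"))


def classify(addr):
    """Return 'static', 'dhcp', 'unused', or 'outside' for an address."""
    ip = ipaddress.ip_address(addr)
    if ip not in NETWORK:
        return "outside"
    for low, high in STATIC_RANGES:
        if low <= ip <= high:
            return "static"
    if DHCP_POOL[0] <= ip <= DHCP_POOL[1]:
        return "dhcp"
    return "unused"


if __name__ == "__main__":
    for addr in ("10.0.0.5", "10.0.0.11", "10.0.0.51", "10.0.0.254"):
        print(addr, classify(addr))
```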
A brief look with tcpdump shows that traffic is flowing, though replies
back to inside nodes for these specific services just seem to stall for
a long time prior to returning the requested data.
I seem to have exhausted my network smarts. I have no idea why things
would just start acting up. It feels like a reverse lookup issue, but
I'm at a loss to explain why it is happening now, and what I can do to
fix it.
The only working hypothesis I have is that my ISP has changed something,
like the IP of the nameservers. Since I run a caching-only nameserver,
I am NOT the authoritative server for anything internal. DHCP-supplied IP
addresses have never been resolvable (of course), but the few hosts in
/etc/hosts are found first, with BIND filling in the rest. Maybe BIND
is stalling things?
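
One way to test the "is BIND stalling?" hypothesis is to bypass the
system resolver and time a PTR query sent directly to one nameserver at
a time. The sketch below builds a minimal DNS query by hand (standard
RFC 1035 layout) and waits for a UDP reply with a timeout. The function
names and the default timeout are assumptions for illustration, and it
does not verify that the reply ID matches the query.

```python
import ipaddress
import socket
import struct
import time


def build_ptr_query(addr, query_id=0x1234):
    """Build a DNS PTR query packet for addr (recursion desired)."""
    # e.g. "10.0.0.11" -> "11.0.0.10.in-addr.arpa"
    name = ipaddress.ip_address(addr).reverse_pointer
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    # QTYPE 12 = PTR, QCLASS 1 = IN
    return header + qname + struct.pack("!HH", 12, 1)


def time_ptr_query(server, addr, timeout=5.0):
    """Return (rcode or None, elapsed seconds); None means no usable reply."""
    packet = build_ptr_query(addr)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(packet, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:
            # Timeout or network error: treat as "no answer".
            return None, time.monotonic() - start
        elapsed = time.monotonic() - start
    # RCODE is the low 4 bits of the fourth header byte
    # (0 = NOERROR, 2 = SERVFAIL, 3 = NXDOMAIN).
    rcode = reply[3] & 0x0F if len(reply) >= 4 else None
    return rcode, elapsed
```

For example, time_ptr_query("10.0.0.1", "10.0.0.11") would query the
local caching nameserver (10.0.0.1 is a hypothetical address for
martini) about a DHCP-supplied address. Repeating the call with the
ISP's nameserver address shows which server accounts for the delay.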
Anyone spare a clue? I'm fresh out; all I have are doughnuts.
-- cm