Re: unixware 7.1.4 sendmail issues




----- Original Message -----
From: "ThreeStar" <sco@xxxxxxxxxxxxxxxxx>
Newsgroups: comp.unix.sco.misc
To: <distro@xxxxxxx>
Sent: Thursday, August 30, 2007 6:38 PM
Subject: Re: unixware 7.1.4 sendmail issues


On Aug 30, 1:30 pm, Jean-Pierre Radley <j...@xxxxxxx> wrote:
Ron Kirschner typed (on Thu, Aug 30, 2007 at 04:12:29PM -0400):
| just discovered that now all mail local, uucp, and smtp is stuck in mqueue
| for no apparent reason - syslog shows entries such as below:
| any idea what my problem might be?
|
|
|
| Aug 30 16:07:02 casedrum sendmail[5839]: rejecting connections on daemon Daemon0 : load average: 36
| Aug 30 16:07:17 casedrum sendmail[5839]: rejecting connections on daemon Daemon0 : load average: 36
| Aug 30 16:07:37 casedrum sendmail[5839]: runqueue: Skipping queue run -- load average too high
| Aug 30 16:07:37 casedrum sendmail[5839]: rejecting connections on daemon Daemon0: load average: 35

Please consider the possibility that the problem is exactly what you are
being told: the load average is too high.

--
JP
==>http://www.frappr.com/cusm<==

By which my esteemed colleague means that it seems like a sendmail
issue, not a Unixware issue. Sendmail's configuration file specifies
load limits beyond which it will merely queue requests or reject them
totally (QueueLA and RefuseLA, respectively). Not sure what
Unixware's defaults are but except in a dedicated mail server they'd
typically be well below the 35% range.

If you're sure that the pending mail is legitimate (that is, your
system isn't originating or relaying spam) you can force a temporary
override with the appropriate sendmail -O option or permanently change
the limits.
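
For the temporary route, something like the following might do. It is only a
sketch: the thresholds 50 and 60 are made-up values, /usr/lib/sendmail is an
assumed path, and it only prints the command so you can look it over before
running it as root.

```shell
#!/bin/sh
# Hypothetical thresholds for a one-off queue run; pick values above the
# load average the box is currently reporting.
QUEUE_LA=50
REFUSE_LA=60
SENDMAIL=/usr/lib/sendmail    # assumed path; adjust for your system

# -q runs the queue once, -v is verbose, -O sets an option for this run only.
cmd="$SENDMAIL -q -v -O QueueLA=$QUEUE_LA -O RefuseLA=$REFUSE_LA"
echo "$cmd"
# To actually run it, uncomment the next line:
# $cmd
```

The permanent equivalent would be the "O QueueLA=" and "O RefuseLA=" lines in
sendmail.cf, followed by restarting the daemon.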

A load average of 35 is not 35% of anything. What it is is astronomical.
Something somewhere on the box is running away like crazy, or has been
building up over time. Forcing the mail to go might resolve it if it just
happens to be the mail server itself that is running away and making the
load average so high, but we don't know anything of the sort at this point.
It might be a cron job that's running every night and hanging every night,
thus adding 1 to the load average every day. Like a tape backup asking
/dev/null for a second tape. Or it might be a one time fluke event like a
report that generated a zillion emails or faxes or print jobs... Or it might
be a pc on the network with a virus slamming some service it found on the
server like the web server or mail or facetwin/samba/visionfs. Or maybe the
machine has a raid array in a degraded state running very slow causing
normal ops to pile up with no obvious problem processes to account for it.
Or maybe... anything.
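
To put the number in proportion: the load average is roughly the count of
processes running or waiting to run, so it only means something relative to
the number of cpus. A rough sketch, where load1 and per_cpu are hypothetical
helper names and the cpu count is something you supply yourself:

```shell
#!/bin/sh
# load1: print the first (1-minute) figure after "load average:" from
# uptime-style text on stdin. Assumes a "." decimal point.
load1() {
    sed -n 's/.*load average[s]*: *\([0-9][0-9.]*\).*/\1/p'
}

# per_cpu LOAD NCPU: print LOAD divided by NCPU, to two decimals.
# NCPU is not detected here; pass in whatever your system reports.
per_cpu() {
    awk -v l="$1" -v n="$2" 'BEGIN { printf "%.2f\n", l / n }'
}
```

Usage would be "uptime | load1" to get the figure, then for example
"per_cpu 36 2", which prints 18.00: about 18 processes competing for each cpu
on a hypothetical two-cpu box.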

So the first thing is to find out what the 35 or 36 processes are that are
trying to run but that the cpu can't find time to get to:

Find cpu hogs:
ps -eo pcpu,pid,tty,args | sort -n
The worst offenders will be at the bottom of the list, so don't worry about
the ones that scrolled off the top.

Or if you wanna get fancy yet can't install "top" from skunkware:

-----------------
#!/bin/ksh
# ptop - quick-n-dirty top process display
# usage: ptop [n]
# shows top n cpu hogs. Default is 10.
# brian@xxxxxxxxx

N=${1:-10}    # number of processes to show
while true ; do
    clear
    uptime    # current load average
    echo "Top $N CPU Hogs..."
    echo "%CPU PID TTY COMMAND"
    # sort numerically on %CPU, highest first, and keep the first N lines
    ps -eo pcpu,pid,tty,args | sort -rn | head -n "$N"
    sleep 1
done
-----------------

Except the processes currently using the most cpu might not be offending
anything. So just look at that but don't read too much into it yet. Do look
at this though...

Find any non-sleeping processes:
ps -elf |awk '($2!="S"){print $0}'


Any non-sleepers that are eating cpu and that have been running a long
time are likely culprits.
Non-sleepers that aren't eating cpu might just be normal stuff that's being
held up by other stuff.
Be careful what you kill and why.
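
To see at a glance how the process table breaks down by state, a sketch along
these lines tallies the S column. count_states is a hypothetical name, and it
assumes the state is the second column of ps -el output, as in the awk
one-liner above; the state letters themselves vary between systems.

```shell
#!/bin/sh
# count_states: read `ps -el`-style output on stdin, skip the header line,
# and print each state letter with the number of processes in that state.
count_states() {
    awk 'NR > 1 { n[$2]++ } END { for (s in n) print s, n[s] }' | sort
}
```

Usage would be "ps -el | count_states". A count of non-sleeping processes that
is in the same ballpark as the load average suggests those are the processes
to look at first.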

Brian K. White brian@xxxxxxxxx http://www.myspace.com/KEYofR
+++++[>+++[>+++++>+++++++<<-]<-]>>+.>.+++++.+++++++.-.[>+<---]>++.
filePro BBx Linux SCO FreeBSD #callahans Satriani Filk!
