Processes in <exiting> state
- From: "Tom Einertson" <tome@xxxxxxxxxxxxxxxx>
- Date: Sat, 1 Apr 2006 16:04:01 -0600
We have been running into problems with processes getting stuck in the
<exiting> state and consuming a lot of system time.
We seem to trigger this problem by using ^C to terminate one of our
applications. This doesn't always kill the process, so we then do a
'kill -9' on it. The process still doesn't exit. At this point the
process shows as <exiting> in 'ps -ef'.
UID PID PPID C STIME TTY TIME CMD
- 26398 - - - <exiting>
These processes aren't defunct - they are still running, although you
can no longer see how much time they are accumulating. vmstat shows no
free time on the processor - it is spending all its time in system
mode. I ran a system trace and it appears that these processes are
stuck trying to write to a socket over and over again. The following
is typical.
MBUF m_getclustm canwait=M_WAIT type=MT_DATA callfrom=00150DF0 callfrom2=00000000 pid=29016 ()
MBUF return from m_getclustm mbuf=70450900 dataptr=70160720
TCP tcp_usrreq so=704F8800 req=00000009 m=70450900 nam=00000000
MBUF m_free mbuf=70450900 dataptr=70160720 callfrom=00125DD8 callfrom2=00000000 pid=29016 ()
MBUF return from m_free mbuf=70450900
TCP tcp_output tp=704F89F0 so=704F8800
TCP tcp_output tp=704F89F0 so=704F8800
TCP tcp_usrreq_err so=704F88t -
From this it appears that these processes are trying to write to a
socket. I can find the corresponding connection using netstat -Ao,
and it shows that the connection is in the CLOSE_WAIT state. So the
server has already closed the other end of the connection. We seem
stuck trying to respond or to close our end of the connection.
704f89f0 tcp4 4 0 loopback.35575 loopback.10005 CLOSE_WAIT
so_state: (ISCONNECTED|CANTRCVMORE|NBIO)
timeo:0 uid:31017
so_special: (LOCKBALE|MEMCOMPRESS|DISABLE)
so_special2: (PROC)
sndbuf:
hiwat:0 lowat:0 mbcnt:0 mbmax:0
sb_flags: (LOCK)
rcvbuf:
hiwat:67424 lowat:1 mbcnt:256 mbmax:269696
sb_flags: (SEL)
TCP:
mss:16856 flags: (NODELAY)
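For reference, the CLOSE_WAIT state on its own is easy to reproduce: it is
what the local end of a TCP connection shows once the peer has closed and
the local side has not yet closed its own socket. The following is a
minimal Python sketch of that situation on the loopback interface. It is
not our application's code, the function name make_close_wait_socket is
made up for illustration, and it does not reproduce the <exiting> hang
itself.

```python
import socket


def make_close_wait_socket():
    """Return a connected client socket whose peer has already closed.

    Hypothetical helper for illustration only. Until the returned socket
    is closed, the local end of the connection sits in CLOSE_WAIT.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.close()  # the "server" closes first, which sends a FIN
    listener.close()
    return client


client = make_close_wait_socket()
client.settimeout(5)       # avoid blocking forever if something goes wrong
data = client.recv(1024)   # end-of-file: empty bytes once the FIN arrives
peer_closed = (data == b"")
client.close()             # only now does the local end leave CLOSE_WAIT
```

While the client socket in this sketch is still open, a netstat listing
should show its connection in CLOSE_WAIT, as in the output above.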
Has anyone encountered this problem before, or do you know what causes
it? A search on Google and on IBM's web sites shows some problems with
TTYs causing similar symptoms on AIX back in 1996 and 1997, but nothing
about sockets and nothing more recent than that.
Or failing an explanation, do you have any idea how to track this
problem down further?
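One small step we can take ourselves is to watch for connections in
CLOSE_WAIT over time. The sketch below filters saved netstat text for such
connections. It assumes each connection appears on a single line with
seven whitespace-separated fields (PCB address, protocol, Recv-Q, Send-Q,
local address, remote address, state), as in the entry quoted earlier; the
function name close_wait_connections is hypothetical.

```python
def close_wait_connections(netstat_text):
    """Return (pcb_address, local, remote) for each TCP line in CLOSE_WAIT.

    Assumes seven whitespace-separated fields per connection line:
    PCB address, protocol, Recv-Q, Send-Q, local, remote, state.
    Lines with any other shape (headers, socket option details) are skipped.
    """
    found = []
    for line in netstat_text.splitlines():
        fields = line.split()
        if (len(fields) == 7
                and fields[1].startswith("tcp")
                and fields[6] == "CLOSE_WAIT"):
            found.append((fields[0], fields[4], fields[5]))
    return found


# Sample text built from the connection quoted in this post.
sample = (
    "704f89f0 tcp4 4 0 loopback.35575 loopback.10005 CLOSE_WAIT\n"
    "so_state: (ISCONNECTED|CANTRCVMORE|NBIO)\n"
    "timeo:0 uid:31017\n"
)
stuck = close_wait_connections(sample)
```

Running this periodically against fresh netstat output would at least show
whether a connection is already in CLOSE_WAIT before the ^C is sent.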
--
Tom Einertson E-mail: tome@xxxxxxxxxxxxxxxx
SIEMENS Power Transmission & Distribution Phone: (952) 607-2244
Energy Management & Automation Division Fax: (952) 607-2018
10900 Wayzata Boulevard, Suite 400
Minnetonka, MN, 55305