Abnormal process kill.
From: Neil (nlombardospamtrap_at_rosbif.org)
Date: 05/09/03
- Next message: Mark: "Re: where is the zic source file for TZ=US/Eastern?"
- Previous message: A: "Re: Helpdesk solution?"
- Next in thread: Barry Margolin: "Re: Abnormal process kill."
- Reply:(deleted message) Barry Margolin: "Re: Abnormal process kill."
- Reply:(deleted message) Chuck Dillon: "Re: Abnormal process kill."
Date: Fri, 9 May 2003 10:07:11 +0200
Hi there,
I thought I'd post this, as frankly, I'm stumped. I have no idea and no
clue.
One of my customers is running an HPUX 11.00 system and has a problem
in which processes die for no apparent reason. The machine is up to date
with all HPUX patches.
There are no core files, no log record and no specific user-id (i.e. the
process termination appears to be random).
Generally speaking, application processes do not simply die without
reason. My customer thinks that this may be performance related, but as
there are no logs or any other info, I'm doubtful.
Questions and answers that I have already checked out are below:
Q. Conditions such as memory depletion, where the process needs more
memory, but memory reservation or swap-space is not available.
A. This doesn't appear to be a problem, as the machine is not working at
full capacity.
Q. Violating per-process resource limit(s), such as the CPU-time limit.
A. This may be the case, but the problem has also occurred on very quiet
days.
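One way to rule limits in or out is to print the limits a process actually runs with. The following is only a minimal sketch, assuming Python and its standard resource module are available on the machine (not confirmed for this HPUX system); the labels in LIMITS are illustrative names, not anything from the application.

```python
import resource

# Illustrative labels mapped to standard POSIX resource limits.
LIMITS = {
    "cpu_seconds": resource.RLIMIT_CPU,    # exceeding the soft limit raises SIGXCPU
    "data_bytes": resource.RLIMIT_DATA,    # heap growth limit
    "core_bytes": resource.RLIMIT_CORE,    # 0 means no core file is written
}


def current_limits():
    """Return {label: (soft, hard)} with None standing for 'unlimited'."""
    def clean(value):
        return None if value == resource.RLIM_INFINITY else value

    result = {}
    for label, which in LIMITS.items():
        soft, hard = resource.getrlimit(which)
        result[label] = (clean(soft), clean(hard))
    return result


if __name__ == "__main__":
    for label, (soft, hard) in current_limits().items():
        print(f"{label}: soft={soft} hard={hard}")
```

Running this from the same startup script and user that launch the application shows the limits the application inherits, which may differ from those of an interactive login.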
Q. Upon such a condition, the operating system will signal this to the
process, and the process may act on it through its signal handlers. If
the process set up no signal handler, the default action is taken.
A. All the processes have signal-handler routines for the termination
signal that a plain kill sends. In any case, all the processes terminate
normally when sent that signal (SIGTERM, the default for kill
<process_id>). (Authorized operators handle the terminations of the main
programs with menu options. Main programs terminate their subprograms
automatically.) In our problem case, they seem to be terminated as if by
kill -9. The kill -9 command sends SIGKILL, which cannot be caught or
ignored, so it couldn't be handled in the programs.
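For reference, which signals a program can handle is easy to check. This is a rough sketch in Python rather than the application's own language; can_install_handler is a hypothetical helper name, and it must be called from the main thread.

```python
import signal


def can_install_handler(signum):
    """Return True if the OS lets this process install a handler for signum."""
    try:
        previous = signal.signal(signum, lambda received, frame: None)
    except (OSError, ValueError, RuntimeError):
        # The kernel refuses handlers for SIGKILL and SIGSTOP.
        return False
    if previous is not None:
        signal.signal(signum, previous)  # put the original disposition back
    return True


if __name__ == "__main__":
    for name in ("SIGTERM", "SIGABRT", "SIGXCPU", "SIGKILL"):
        signum = getattr(signal, name)
        print(name, int(signum), can_install_handler(signum))
```

SIGTERM (what a plain kill sends), SIGABRT and SIGXCPU can all be caught and logged; SIGKILL (kill -9) cannot, so a process that vanishes without running any of its handlers is consistent with SIGKILL.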
Q. But in general, when running program/scripts from command line, the
executing shell will receive a notification of a failed process.
A. All of our application programs are started in the background by
shell scripts that run inside the main programs, using nohup
process_name &.
During startup, our main programs put themselves into the background
with the setpgrp() and fork() calls, just after completing their initial
checks.
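Once a program has forked and its original parent has exited, the launching shell is no longer its parent and never sees how it ended. A parent that stays alive and waits on the child does get the termination status. The sketch below illustrates the idea in Python; supervise and describe_exit are hypothetical names, and the self-killing child is only there to make the demo deterministic.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a readable termination cause."""
    if returncode < 0:
        # subprocess reports death-by-signal as the negated signal number.
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown"
        return f"killed by signal {signum} ({name})"
    return f"exited with status {returncode}"


def supervise(argv):
    """Start argv as a child, wait for it, and report how it ended."""
    child = subprocess.Popen(argv)
    return describe_exit(child.wait())


if __name__ == "__main__":
    # Demo child that sends itself SIGKILL, imitating an outside kill -9.
    demo = [sys.executable, "-c",
            "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
    print(supervise(demo))
```

Writing the result of supervise to a log for each subprogram would at least record which signal, if any, ended the process, even when the process itself had no chance to log anything.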
Q. Such "core" files may *not* be created if the process' current
directory cannot be written, or if the application is running with
set-uid/set-gid bits (and the real user is different from the file
owner).
A. set-gid {setpgrp()} is only used in our 4 broadcasting programs, and
these broadcasting programs do nothing during broadcast; rather, the
subprograms do all the work. Such a problem hasn't been encountered in
these programs yet. They also have the necessary rights on the "current
directory".
Still, the processes could be altered to handle some of the above
signals (SIGBUS, SIGSEGV, SIGXCPU). (It should also be considered that
there are about 60 processes subject to alteration.)
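The conditions listed in the question can be checked mechanically from inside a process. What follows is a sketch under assumptions: it uses Python's standard os and resource modules, core_dump_obstacles is a made-up helper name, and it adds one condition not mentioned above, a core size limit of zero, which also suppresses core files.

```python
import os
import resource


def core_dump_obstacles(directory="."):
    """List reasons why a core file might not be written in directory."""
    reasons = []

    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    if soft == 0:
        reasons.append("core size limit is 0")

    # os.access checks with the real user and group IDs.
    if not os.access(directory, os.W_OK):
        reasons.append(f"directory {directory!r} is not writable")

    # Differing real and effective IDs indicate set-uid/set-gid execution.
    if os.getuid() != os.geteuid() or os.getgid() != os.getegid():
        reasons.append("real and effective user/group IDs differ")

    return reasons


if __name__ == "__main__":
    found = core_dump_obstacles()
    print(found if found else "no obvious obstacle to core files")
```

An empty list does not prove a core file would be written, but a non-empty one explains why none appears. Note also that a process ended by SIGKILL or SIGTERM does not produce a core file in any case.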
Q. Which process(es) is/are using much CPU? And how does this relate to
the unexpected termination of processes?
A. If we could find a relation between this and the unexpected
terminations, we could address the problem with methods such as
separating the functions of the processes or utilizing the function.
Q. Under what user-id are (were) the affected processes running?
A. Operators run the broadcasting programs with the aid of a built-in
menu.
Q. is there any application log(s) that provides information on process
termination?
A. The majority of our programs record their stop time in their
individual log files. But the programs hit by the process kill problem
could not record their stop time in the log file; they just die before
doing so.
Q. is there any "core" file generated (if no indication of core files:
is there any "core" file anywhere on the system)?
A. No, this has not been seen.
Q. are there any messages anywhere when a process terminates
unexpectedly?
A. They generate no message while dying.
Anyone seen this sort of problem before? Any ideas?
TIA,
Neil