fork() race in SIGCHLD handler
From: Chris Vine (chris_at_cvine--nospam--.freeserve.co.uk)
Date: 12/30/03
Date: Tue, 30 Dec 2003 14:13:15 +0000
Hi,
In tracking down some unexpected behaviour in a program of mine, I have
noticed that the test code below exhibits a timing race on fork() in Linux
kernel 2.6.0. If the test code below illustrating the problem is compiled
and run, the first output is "child_pid is less than zero", all subsequent
lines being the expected "child_pid is greater than zero". (I get
"child_pid is greater than zero" at all times with kernel 2.4.23).
What is clearly happening is that by the time the child process has exited
and the SIGCHLD handler is executing in the parent process, the parent
process has not yet assigned the return value of fork() to its copy of the
static child_pid variable. In other words, the parent does not resume
executing after fork() until the child process has already terminated.
At first I thought this may be a kernel or glibc bug, but on reflection I am
not so sure - I do not think the asynchronous SIGCHLD handler in the parent
process is required to wait around for the process in which it is executing
to begin executing synchronously after fork()ing. I can get around the
difficulty by putting a flag in the first line of the parent process after
the fork(), on which the SIGCHLD handler can block, but does POSIX say
anything about this? (Apologies for posting this to comp.unix.programmer
as well as comp.os.linux.development.system, but the comp.unix.programmer
regulars seem to know a lot about POSIX).
It is interesting though that the assignment of -1 to child_pid in the last
line of the signal handler does not cause subsequent calls to the handler
to exhibit the "unexpected" behaviour. Probably on all iterations after
the first the address space of the parent process has been set up so that
copies-on-write to its copy of child_pid take place quickly.
This is with glibc 2.3.2, and the compiler is gcc-3.2.3.
Chris.
//////////////////// test code ///////////////////
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdlib.h>
#include <time.h>

/* Shared between main() and the SIGCHLD handler. */
pid_t child_pid = -1;

void childexit_signalhandler(int sig) {
    static const char message_less[] = "child_pid is less than zero\n";
    static const char message_greater[] = "child_pid is greater than zero\n";

    (void)sig;
    waitpid(-1, 0, WNOHANG); /* eliminate zombies */

    /* write() is async-signal-safe; sizeof - 1 omits the trailing NUL */
    if (child_pid < 0)
        write(STDOUT_FILENO, message_less, sizeof(message_less) - 1);
    else if (child_pid > 0)
        write(STDOUT_FILENO, message_greater, sizeof(message_greater) - 1);

    child_pid = -1;
}

int main(void) {
    struct sigaction sig_act_chld;
    int index;
    struct timespec delay;
    struct timespec residual_delay;

    sig_act_chld.sa_handler = childexit_signalhandler;
    sigemptyset(&sig_act_chld.sa_mask);
    sig_act_chld.sa_flags = 0;
    sigaction(SIGCHLD, &sig_act_chld, 0);

    for (;;) {
        /* the handler may run before this assignment completes */
        child_pid = fork();

        if (child_pid > 0) { /* parent process */
            /* set up a 1s delay */
            delay.tv_sec = 1;
            delay.tv_nsec = 0;
            /* SIGCHLD interrupts nanosleep(); resume for the remainder */
            while (nanosleep(&delay, &residual_delay) == -1)
                delay = residual_delay;
        }

        if (!child_pid) { /* child process */
            /* do something meaningless */
            for (index = 0; index < 100; index++);
            _exit(0);
        }
    }
    return 0;
}
-- To reply by e-mail, remove the "--nospam--" in the address