Re: fork() race in SIGCHLD handler
From: Chris Vine (chris_at_cvine--nospam--.freeserve.co.uk)
Date: 12/30/03
- Next message: Shuqing Wu: "Re: delay on select()"
- Previous message: john: "Re: Determine if open(2) created or opened?"
- In reply to: P.T. Breuer: "Re: fork() race in SIGCHLD handler"
- Next in thread: Kasper Dupont: "Re: fork() race in SIGCHLD handler"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Tue, 30 Dec 2003 16:31:51 +0000
P.T. Breuer wrote:
[snip]
> Well, this sounds a bit like the old issue of "who runs first". I
> presume this is UP?
It is UP, yes, but the scheduler should give each process a reasonably
equal share of processor time. I expect the real issue is which process
has to absorb the copy-on-write delays, and that this has changed between
kernel 2.4 and kernel 2.6.
>> address space. The parent process after fork()ing is beginning after the
>> child process has already terminated.
[snip]
> Well, in fact the handler executes (because the child dies)
> before the return value from the fork is known!
>
> It sounds like you should delay setting the handler in the parent till
> the fork has returned, but that leaves a window in which the child can
> die unattended.
I cannot do that, because in the test code I know that the child will
terminate before fork() has returned in the parent, so I would definitely
miss it.
>> I can get around the difficulty by putting a flag in the first line of
>> the parent process after the fork(), on which the SIGCHLD handler can
>> block, but does POSIX say anything about this? (Apologies for posting
>> this to comp.unix.programmer
>
> Dunno. I imagine so.
On further thought, of course my idea will not work - the parent and the
signal handler are in the same thread of execution, so blocking the handler
will block the parent.
I suspect the proper approach is to mask off SIGCHLD before fork()ing, and
then unmask it again in the parent once the fork() has returned. I have
just tried that and it appears to work as expected. Phew!
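For reference, the mask/unmask approach can be sketched roughly as follows.
This is an illustration only, not the test code discussed in this thread: the
names on_sigchld, reaped_expected and fork_with_sigchld_masked are
hypothetical, the child simply exits at once, and the polling loop at the end
exists only so that the example can report whether the handler saw the
expected pid.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Written by the parent while SIGCHLD is blocked, read by the handler. */
static volatile pid_t child_pid = -1;
/* Hypothetical flag: set once the handler has reaped the expected child. */
static volatile sig_atomic_t reaped_expected = 0;

static void on_sigchld(int signo)
{
    int saved_errno = errno;   /* waitpid() may clobber errno */
    int status;
    pid_t pid;

    (void)signo;
    /* Reap every child that has terminated so far, without blocking. */
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        if (pid == child_pid) {
            reaped_expected = 1;
            child_pid = -1;
        }
    }
    errno = saved_errno;
}

/* Returns 1 if the handler matched the child's pid, 0 if it did not do so
   within about five seconds, and -1 on error. */
int fork_with_sigchld_masked(void)
{
    struct sigaction sa;
    sigset_t block, old;
    struct timespec tick = { 0, 10 * 1000 * 1000 };   /* 10 ms */
    pid_t pid;
    int i;

    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    /* Block SIGCHLD so the handler cannot run before child_pid is set. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &old) == -1)
        return -1;

    reaped_expected = 0;
    pid = fork();
    if (pid == 0)
        _exit(0);              /* child: terminate immediately */
    if (pid > 0)
        child_pid = pid;       /* safe: SIGCHLD is still blocked here */

    /* Restore the previous mask; a pending SIGCHLD is delivered now. */
    if (sigprocmask(SIG_SETMASK, &old, NULL) == -1 || pid == -1)
        return -1;

    /* Demonstration only: poll until the handler has run. */
    for (i = 0; i < 500 && !reaped_expected; i++)
        nanosleep(&tick, NULL);
    return reaped_expected ? 1 : 0;
}
```

Calling fork_with_sigchld_masked() from main() and checking for a return
value of 1 shows whether the handler saw the right pid. If the child dies
before fork() returns in the parent, the signal simply stays pending until
the mask is restored, by which time child_pid already holds the child's pid.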
>> It is interesting though that the assignment of -1 to child_pid in the
>> last line of the signal handler does not cause subsequent calls to the
>> handler to exhibit the "unexpected" behaviour. Probably on all
>> iterations after the first the address space of the parent process has
>> been set up so that copies-on-write to its copy of child_pid take place
>> quickly.
>
> Unknown by me. Compiler issues will intervene to make that cloudy.
>
>> This is with glibc 2.3.2, and the compiler is gcc-3.2.3.
>
> Well try a different compiler for a start! Though I don't know if it
> will make a difference, since so far life sounds vaguely within spec.
> But mapping the space would be useful. Try no optimization.
Actually, I don't think the compiler is the issue - the explanation for this
behaviour has been identified by both of us above and should be compiler
independent. I have tried the same test code on another machine with
gcc-2.95.3, and with kernel 2.6.0 I get the same problem as with gcc-3.2.
It also occurs with or without optimisation. Only the mask/unmask approach
seems to guarantee correct results.
>> pid_t child_pid = -1;
>
> It makes me nervous not having this static. Shrug. Shouldn't you
> declare this volatile too?
I probably should, but having changed it to volatile the effect is still the
same.
Chris.
-- To reply by e-mail, remove the "--nospam--" in the address