Re: OSR505 Signal trapping in shell scripts
From: Ian Wilson (scobloke2_at_infotop.co.uk)
Date: 10/01/03
- Previous message: rpms: "Change the modeline"
- In reply to: Bela Lubkin: "Re: OSR505 Signal trapping in shell scripts"
- Next in thread: Bela Lubkin: "Re: OSR505 Signal trapping in shell scripts"
- Reply: Bela Lubkin: "Re: OSR505 Signal trapping in shell scripts"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Wed, 1 Oct 2003 09:56:51 +0000 (UTC)
Wow - thanks Bela, problem solved.
Further unimportant and skippable ramblings added below.
Bela Lubkin wrote:
> Ian Wilson wrote:
>
>
>>My apologies if this is insufficiently SCO specific :-)
>
>
> Can't tell if it is or isn't, but certainly it's a _programming_
> question...
>
>
>>I'm trying to get an interactive script to clean up temporary files
>>when a terminal disconnects (telnet in this case).
>>
>>My test script is at the end.
>>
>>Here's what I get when I press the Interrupt key (stty intr)
>>2003-09-30 15:15:07 : Trapped signal INT
>>2003-09-30 15:15:07 : Trapped signal Normal Exit
>>
>>Here's what I get when I abruptly disconnect my terminal (exit telnet
>>client)
>>2003-09-30 15:15:28 : Trapped signal HUP
>>2003-09-30 15:15:28 : Trapped signal Normal Exit
>>2003-09-30 15:15:28 : Trapped signal HUP
>>
>>A file that should be deleted in a signal trap still exists (see script).
>>
>>Q: Why isn't my "rm" working and why is HUP recorded twice?
>
>
> I tested this in the same way: exiting telnet client. This might be
> significant.
>
> I learned a bunch of things, can't say I know _exactly_ what's going on,
> but at least there are workarounds.
>
> The proximate cause of not removing trapint.log is that your `rm`
> command is receiving a SIGHUP. Oddly, if I change it to `rm -f` then it
> does not receive a SIGHUP. I verified this with both `trace` and
> `truss`.
>
> One workaround is to put at the top of exithandler():
>
> trap "" 1 # ignore SIGHUP from now on
That's what I used - thanks.
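
A minimal sketch of that workaround, written in the Bourne-style "exithandler() { ..." form so it should run under either shell. The scratch directory and the `kill -HUP $$` line are assumptions added only to simulate a hangup arriving during cleanup; neither is in the original test script, and none of this has been tried on OSR5.

```shell
#!/bin/sh
# Sketch only: the scratch directory and the self-sent HUP are assumptions
# made for this demonstration, not part of the original test script.
demo=${TMPDIR:-/tmp}/trapdemo.$$
mkdir -p "$demo"

cat > "$demo/cleanup.sh" <<'EOF'
exithandler() {
    trap '' 1                 # ignore SIGHUP from now on
    kill -HUP $$              # simulate a hangup arriving during cleanup
    rm trapint.pid >> trapint.log 2>&1
    echo "cleanup finished" >> trapint.log
}
trap 'exithandler' 0          # run cleanup on any exit
trap 'exit 1' 1 2             # HUP or INT: leave via the exit trap
echo $$ > trapint.pid
exit 0
EOF

# Run the child in the scratch directory so its files stay out of the way.
( cd "$demo" && sh cleanup.sh )
```

Because an ignored signal stays ignored in child processes, the `rm` started from the handler is covered as well, which matches the point that this protects anything run from the exit function.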
> I thought this might be due to ksh's complex process handling, so I
> tried with the Bourne shell. Besides the "#!/bin/ksh", I had to change
> the function declarations to the "exithandler() { ..." style. Your real
> program probably has other Korn-isms in it, but the test code worked
> under Bourne. It also behaved exactly the same, which was a surprise to
> me.
My real script started out as a C shell script, but onintr is too limiting.
Yes I've read "CSH programming considered harmful" but I kinda got
pushed into it. Normally I use Korn shell.
My Kornised real script actually went into a near infinite loop of
SIGHUPs until it seemed to run out of resources. Your explanation of
STOPIO accounts for how my attempt at signal trapping and use of IO
caused this.
> I then changed the kernel parameter SECSTOPIO to 0 (I did so as follows
> -- live kernel brain surgery, albeit very simple:
>
> # scodb -w
> scodb> sec_stopio=0
> scodb> q
>
> ) -- and the problem went away. So this is an interaction with
> stopio(S). `telnetd` calls stopio() on the tty you were logged in on.
> This is a more persistent source of SIGHUPs than a mere hangup. If you
> were running the script on a serial terminal on which you had _not_
> logged in, merely opened the modem-control port and ran the script --
> then the problem wouldn't exist. If you had logged in then it probably
> would, because `login` or `init` would stopio() the port.
>
> You didn't say what release of OSR5 you're using.
OpenServer 5.0.5 - pending an upgrade to 5.0.7.
> I'm on 507. This
> behavior was actually much worse in the past (502 and earlier); on those
> releases, you would get a SIGHUP _every_ time you tried to access a file
> descriptor on which stopio() had been called. Starting in 504, a
> process only gets one SIGHUP per fd.
>
> The question remains, what was `rm` doing to that fd which provoked a
> SIGHUP? Checking the source... I see this:
>
> if ((!fflag && isatty(0)) || iflag) {
> /* then look up the localized regular expression for "yes" */
>
> Kind of a silly code path. It knows that `rm -i`, and `rm` without "-f"
> _when on an interactive terminal_, is potentially going to ask
> questions. Not that it _is_, only potentially. isatty(0) is going to
> end up doing ioctl()s on fd 0, which is your stopio()'d telnetd pty.
> Thus, doom.
>
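
The shell's `[ -t 0 ]` test asks the same question as isatty(0), so it gives a rough way to see why redirecting stdin from /dev/null steers a program away from that code path. A small sketch follows; the function name `stdin_kind` is made up for illustration, and it says nothing about how a stopio()'d pty itself responds.

```shell
#!/bin/sh
# stdin_kind is a hypothetical helper: it reports whether fd 0 is a terminal,
# which is the same question isatty(0) answers.
stdin_kind() {
    if [ -t 0 ]; then
        echo "tty"
    else
        echo "not a tty"
    fi
}

# With stdin redirected, the check fails, so a program that guards its
# tty handling with isatty(0) takes the non-interactive path.
from_null=$(stdin_kind < /dev/null)
echo "stdin from /dev/null: $from_null"
```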
> The workaround of doing "trap '' 1" will protect _anything_ you run out
> of the exit function. You could also specifically protect `rm` by doing
>
> rm trapint.pid >> trapint.log 2>&1 < /dev/null
>
> ... but that's much shakier: it depends on information from inside SCO's
> `rm` source code that could change without your knowledge.
>
> So...
>
> Various workarounds (choose 1):
>
> # at top of exit function: ignore SIGHUP
> trap "" 1
I chose that one. It's also hinted at in various books I had consulted.
> # at top of exit function: close dangerous descriptors
> exec </dev/null >/dev/null 2>&1 # at top of exit function
>
> # protect just the rm command itself:
> rm trapint.pid >> trapint.log 2>&1 < /dev/null
>
> # disable stopio(S) system-wide:
> cd /etc/conf/cf.d
> ./configure
> "8" ... and set SECSTOPIO to 0
> # then relink, reboot
>
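
Of the workarounds listed, the descriptor-redirecting one is perhaps the least obvious, so here is a rough sketch of it. The scratch directory and the `[ -t N ]` checks are assumptions added to make the effect visible; behaviour against a real stopio()'d pty is untested here.

```shell
#!/bin/sh
# Sketch only: the paths below are invented for the demonstration.
demo2=${TMPDIR:-/tmp}/execdemo.$$
mkdir -p "$demo2"

cat > "$demo2/cleanup.sh" <<'EOF'
exithandler() {
    # Point fds 0, 1 and 2 away from the terminal before doing anything else.
    exec </dev/null >/dev/null 2>&1
    if [ -t 0 ] || [ -t 1 ] || [ -t 2 ]; then
        echo "a tty descriptor is still open" >> trapint.log
    else
        echo "no tty descriptors left" >> trapint.log
    fi
    rm trapint.pid >> trapint.log 2>&1
}
trap 'exithandler' 0          # run cleanup on any exit
echo $$ > trapint.pid
exit 0
EOF

( cd "$demo2" && sh cleanup.sh )
```

An `exec` with only redirections changes the descriptors of the running shell itself, so every command started later in the handler inherits /dev/null instead of the terminal.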
>>My apologies if this is insufficiently SCO specific :-)
>
>
> So, turned out to be very OpenServer-specific. I understand that HP-UX
> also has SecureWare stuff in it, but knowing how many years of drift we
> have between the two implementations, I doubt the issues are similar.
> Plus, it was enhanced by a specific silliness in OSR5 `rm`.
>
> But... regarding that specific silliness: it's only silly because you
> might have expected `rm` not to touch the tty. Contrast to some other
> code you might have in your exit function:
>
> echo Bailing! # the shell writes to stdout and is SIGHUP'd
> cat exit.msg # `cat` writes to stdout and is SIGHUP'd
> stty -a > trapint.exit-stty-settings # ioctl(stdin) and is SIGHUP'd
Yep.
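
Given that, it seems safest for the handlers to send every message to a file rather than to stdout. A sketch of a small logging helper along those lines; `logmsg` and `LOGFILE` are made-up names, and the helper simply folds the two duplicated echo/date lines from the test script into one place.

```shell
#!/bin/sh
# Hypothetical helper: append a timestamped message to a log file
# instead of writing to stdout, which may be a hung-up terminal.
LOGFILE=${TMPDIR:-/tmp}/trapint.$$.log

logmsg() {
    echo "`date '+%Y-%m-%d %H:%M:%S'` : $*" >> "$LOGFILE"
}

logmsg "Trapped signal HUP"
```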
> I still don't know precisely why the _shell_ saw SIGHUP twice (thus
> entered traphandler() twice). I can speculate a bit, but haven't set up
> the detailed tests to confirm this. I'm guessing that the shell might
> not have received a SIGHUP from telnetd directly; `perl` did because it
> was in the middle of an I/O when telnetd exited. So the perl statement
> ended, and the shell tried to do:
>
> echo "Shell sees result as $result"
>
> which would write to stdout, so it gets a SIGHUP relating to the
> stopio'd fd 1. Then it got into the trap handler and then the exit
> handler. The exit handler ran `rm` as:
>
> rm trapint.pid >> trapint.log 2>&1
>
> This involves _closing_ fds 1 and 2 (in the process of reopening them to
> "trapint.log"). I'm pretty sure the close(S) system call isn't
> sensitive to stopio(), but suppose the shell does something else on the
> way out. Maybe it flushes a stdio buffer; maybe it attempts an ioctl()
> to see what sort of file it's closing. When it does these things to fd
> 1, nothing special happens (it's already been SIGHUP'd for fd 1). When
> it does them to fd 2, kaboom.
>
> .... untested theory.
Well, I certainly feel I already got more than my money's worth :-)
Five stages of usenet posting:
Frustration
Composition
Expectation
Impatience
Joy
>
>>Bela<
>
>
>>Script trapint.ksh:
>>\/\/\/\/\/\/\/\/\/\/\/\/\/\/
>>#!/bin/ksh
>>
>>#
>># create a record of the existence of this process
>>#
>>echo $$ > trapint.pid
>>
>>#
>># trap interrupts etc
>>#
>>function exithandler {
>> mysignal=$1
>> echo `date "+%Y-%m-%d %H:%M:%S"` \
>> ": Trapped signal $1 " >> trapint.log
>> rm trapint.pid >> trapint.log 2>&1
>>}
>>function traphandler {
>> mysignal=$1
>> echo `date "+%Y-%m-%d %H:%M:%S"` \
>> ": Trapped signal $1 " >> trapint.log
>> exit
>>}
>>#
>># set traps individually
>>#
>>trap 'exithandler "Normal Exit" ' 0
>>trap 'traphandler "HUP" ' 1
>>trap 'traphandler "INT" ' 2
>>
>>#
>># invoke app
>>#
>>perl -e 'print "Enter: "; $ans=<>; print "\n<==$ans\n";'
>>
>>result=$?
>>echo "Shell sees result as $result"
>>
>>exit
>>\/\/\/\/\/\/\/\/\/\/\/\/\/\/