Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: localtraps and signal handling on NetBSD



On Fri, May 13, 2005 at 12:26:59PM +0100, Peter Stephenson wrote:
> Vincent Stemen wrote:
> > I have an idea, that might not be too difficult to implement, that I
> > think would solve the problem, unless you know of other problems it
> > would create.
> > 
> > If, inside the signal handler, I execute the trap statement on the
> > same signal, turn off the intrap status because the signal is now
> > re-trapped.
> 
> I don't think that's the problem.  intrap has a quite limited effect,
> determining how the shell behaves when already handling signals.
> 
> The vague suspicion is that the problem is based on whether or not the
> trap handler is run inside the signal handler, and whether the system
> resets the signal.  It's possible to tweak the queueing code, which
> would force more traps to be run after the handler exits rather than
> within it, but I'm really not sure enough about what's going on to
> suggest anything definite.

I am not sure I am following you correctly.  If, by handler, you are
referring to the shell script function signal handler, I don't think
queuing more signals to run after the handler exits will help.  The
fact that it waits until the handler returns before responding to the
next signal is the problem. Or are you referring to a handler in the
zsh source?

I did some more detailed multi-platform testing that did not involve
Z shell's localtraps, to compare the behavior of different shells on
different platforms.  In this particular test, I focused on just the
handling of the same signal once you are inside a signal handler.  I
thought I would post the results in the hope that it might help.  I
used my machines and various machines available through the
SourceForge.net compile farm.  I got the same behavior on every
platform, including Linux.

Here is my test script:

# --------- <test script> -----------
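sigterm1()
{
    # While inside the handler, point SIGTERM at a different trap.
    trap 'echo "-- sigterm2 --"' TERM
    echo "sigterm1(): sending SIGTERM"
    kill -TERM $$
    # Restore this function as the SIGTERM handler.
    trap sigterm1 TERM
    sleep 1
}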

trap sigterm1 TERM

echo
echo "main: sending SIGTERM"
kill -TERM $$
echo "main: sending SIGTERM"
kill -TERM $$
# --------- </test script> -----------


Since I got the same result on every platform, we are not dealing with
any OS-specific issues in this particular case.

The following shells, on the platforms listed, handled the signals
correctly and produced the output below.

PD KSH v5.2.14 99/07/13.2 on NetBSD 2.0
PD KSH v5.2.14 99/07/13.2 on Linux 2.4.21-27.0.1.ELsmp
ksh on SunOS x86-solaris1 5.9
ash on Linux 2.4.21-27.0.1.ELsmp
sh on NetBSD 2.0
sh on NetBSD 1.6.1
sh on FreeBSD 4.10-BETA
sh on OpenBSD 3.4
sh on SunOS x86-solaris1 5.9

<correct behavior>
main: sending SIGTERM
sigterm1(): sending SIGTERM
-- sigterm2 --
main: sending SIGTERM
sigterm1(): sending SIGTERM
-- sigterm2 --
</correct behavior>
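
For anyone wanting to repeat the comparison, here is a rough sketch of
a helper that runs one script under a named shell and prints the first
few lines of its output.  It is not the procedure used for the results
in this message, just one way it could be done.  The name run_under,
the 12-line cap and the 5-second default timeout are made up for
illustration, and mktemp is assumed to be available.  The watchdog
sends SIGKILL because the script under test traps SIGTERM.

```shell
# run_under SHELL SCRIPT [SECONDS]  (hypothetical helper)
# Run SCRIPT under SHELL, kill it after SECONDS (default 5), and print
# at most 12 lines of its combined stdout and stderr.
run_under() {
    shell=$1 script=$2 secs=${3:-5}
    if ! command -v "$shell" >/dev/null 2>&1; then
        echo "$shell: not installed, skipped"
        return 0
    fi
    out=$(mktemp) || return 1
    "$shell" "$script" >"$out" 2>&1 &
    pid=$!
    # Watchdog: SIGKILL cannot be trapped, so a looping shell still dies.
    ( sleep "$secs"; kill -KILL "$pid" ) >/dev/null 2>&1 &
    watchdog=$!
    wait "$pid" 2>/dev/null || true
    # The script finished or was killed; the watchdog is no longer needed.
    kill "$watchdog" 2>/dev/null || true
    echo "== $shell =="
    head -n 12 "$out"
    rm -f "$out"
}
```

Assuming the test script is saved as sigtest.sh (a hypothetical name),
calling run_under zsh sigtest.sh and then run_under bash sigtest.sh
gives output that can be compared side by side.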


In summary, bash and zsh are the only two shells that do not handle
the signals correctly once inside the handler, and they fail on every
platform.  sh on Solaris and all the BSDs, ksh, and ash (commonly
available on Linux systems) behaved properly.

Here is the output of zsh and bash.

<zsh>
main: sending SIGTERM
sigterm1(): sending SIGTERM
sigterm1(): sending SIGTERM
sigterm1(): sending SIGTERM
sigterm1(): sending SIGTERM
... continues forever
</zsh>

I believe we thought earlier that zsh handled the signals differently
on Linux than on BSD, but in this test that does not appear to be the
case.  In all cases zsh does not process the next signal until it
exits the signal handler, so the trap statement following the kill
command re-installs sigterm1 as the handler before the pending signal
is processed, causing it to loop endlessly. I also got the same result
with the patched version of zsh 4.2.5 that we were testing with earlier.

<bash>
main: sending SIGTERM
sigterm1(): sending SIGTERM
main: sending SIGTERM
sigterm1(): sending SIGTERM
</bash>

The problem with bash is that it disables further signals altogether
once it is in the signal handler and the trap statement does not
re-enable them.  Either problem can be a showstopper.
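
The three behaviors described in this message can be compared with a
toy model that sends no real signals.  It is only a sketch of the
explanations given here, not of how any of these shells are
implemented.  All names (simulate, raise, drain, run_handler, h1, h2)
and the three mode labels are hypothetical, and the limit argument
exists only so that the looping case stops.

```shell
# Toy model: "handler" is the function currently trapped for the fake
# signal, "pending" counts fake signals raised but not yet delivered.
h2() { echo "-- sigterm2 --"; }

h1() {
    handler=h2                      # stands in for: trap '...' TERM
    echo "sigterm1(): sending SIGTERM"
    raise                           # stands in for: kill -TERM $$
    handler=h1                      # stands in for: trap sigterm1 TERM
}

run_handler() {
    in_handler=1
    "$handler"
    in_handler=0
}

raise() {
    case $mode in
        immediate) run_handler ;;                        # delivered at once
        deferred)  pending=$((pending + 1)) ;;           # held until handler returns
        blocked)   [ "$in_handler" -eq 1 ] || run_handler ;;  # dropped inside handler
    esac
}

# Deliver held signals, giving up after $limit handler runs.
drain() {
    while [ "$pending" -gt 0 ] && [ "$runs" -lt "$limit" ]; do
        pending=$((pending - 1))
        runs=$((runs + 1))
        "$handler"
    done
    [ "$pending" -eq 0 ] || echo "... continues forever"
}

# simulate MODE [LIMIT]: model one "kill -TERM $$" sent from main.
simulate() {
    mode=$1 limit=${2:-3} handler=h1 pending=0 runs=0 in_handler=0
    echo "main: sending SIGTERM"
    raise
    drain
}
```

In this model, simulate immediate prints one round of the correct
output, simulate deferred keeps re-running h1 the way zsh does, because
the held signal is always delivered after h1 has re-installed itself,
and simulate blocked stops after the first handler line the way bash
does.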


-- 
Vincent Stemen
Avoid the VeriSign/Network Solutions domain registration trap!
Read how Network Solutions (NSI) was involved in stealing our domain name.
http://inetaddresses.net/about_NSI.html


