Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: Possible signal handling issues

On Sat, 28 Dec 2013 15:02:34 -0800
Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> wrote:
> Both of these have been around since at least 4.2.0.  Consider this script:
> --- snip ---
> sleep 20 &
> TRAPINT() { set -x; kill -INT $$ }
> wait
> --- snip ---
> Run that in the foreground, kill it with ctrl+c, and watch the infinite
> loop.  Something to do with the "wait" command allows the INT to be re-
> queued for handling even when it is sent from inside an INT trap.  The
> signal_suspend() in zwaitjob() is constantly re-interrupted and never
> returns.

The following doesn't get us much further, but I'm not sure executing
"wait" is the key thing here: I think it might have more to do with the
fact that the job was first started in the background.  My tests are
less than conclusive, however, since "fg" shares a lot of code with "wait".

I first tried modifying the above to

set -x # for earlier (de?)mystification
sleep 20 &
TRAPINT() { set -x; kill -INT $$ }

and then running with "zsh -fi" (something in my startup files is
causing a hang without the -f, which is irrelevant, but that's why I
went on and tried the other version below before I found that out).  I

../zwaitjob2.sh:3:> sleep 20
../zwaitjob2.sh:5:> fg %sleep
[1]  + 1792 running    sleep 20

^C reproducibly gives

+TRAPINT:0> set -x
+TRAPINT:0> kill -INT 2004
+TRAPINT:0> set -x
+TRAPINT:0> kill -INT 2004

Then I tried:

set -x
print $$
setopt monitor
sleep 20 &
TRAPINT() { set -x; kill -INT $$ }
fg %sleep

Initially I got

+../zwaitjob3.sh:2> print 1907
+../zwaitjob3.sh:3> setopt monitor
[1] 1910
+../zwaitjob3.sh:4> sleep 20
+../zwaitjob3.sh:6> fg %sleep
[1]  + running    sleep 20

This time neither ^C nor "kill -INT 1907" does anything.  I'm not sure
what's going on here.  However, if I run "kill -INT 1910" (killing the
forked process) from outside I see some variable number of repetitions of

+TRAPINT:0> kill -INT 1922
+TRAPINT:0> set -x

and then the shell exits.

Consequently, this looks to me like some intrinsic race that happens to
be particularly reproducible in the "wait" case.  However, this still
seems to be different to the second problem.

I wondered whether the race was due to the places where signals were
being queued and unqueued, but haven't got anywhere down that route, and
I don't know why this is different when the process wasn't
backgrounded.  I have a vague memory we do something about blocking ^C
when starting a job in the background, though?
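
For reference, here is a minimal sketch (not from the original thread) of
one way to check that last point from a script.  It assumes a
non-interactive shell with job control (MONITOR) off; in that case POSIX
says an asynchronous command starts with SIGINT ignored, so the child
should survive the INT and exit only on the later TERM, giving a wait
status of 143 rather than 130.  The names probe_bg_sigint, bg_pid and
bg_status are made up for illustration, and the result may well differ
with "setopt monitor" or in an interactive shell.

```shell
# Hypothetical probe, not part of the original message.
# Assumes a non-interactive shell with job control (MONITOR) off.
probe_bg_sigint() {
    sleep 5 &
    bg_pid=$!
    sleep 1                           # let the child finish starting up
    kill -INT "$bg_pid" 2>/dev/null   # no effect if the child ignores SIGINT
    sleep 1                           # allow time for delivery
    kill -TERM "$bg_pid" 2>/dev/null  # ends the child if it is still alive
    wait "$bg_pid"
    bg_status=$?                      # 128 + number of the fatal signal
    case $bg_status in
        130) echo "child died from SIGINT" ;;
        143) echo "child ignored SIGINT, died from SIGTERM" ;;
        *)   echo "unexpected wait status: $bg_status" ;;
    esac
}

probe_bg_sigint
```

Checking the wait status rather than probing with "kill -0" avoids
mistaking a not-yet-reaped zombie for a live process.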

