Zsh Mailing List Archive

Zsh signal handling



As you probably know, the most buggy part of zsh is its signal handling
code.

First, signals can be lost because zsh waits for child processes with
sigsuspend(~SIGCHLD), which means that all other signals are blocked while
executing external commands.  That's why zsh executes a handler even if
only the child receives a signal.  One related note: I guess that zsh will
not work on systems that have neither waitpid() nor wait3().  On these
systems it tries to use wait() and calls wait() in the SIGCHLD handler in a
loop until wait() fails.  This practically means that zsh does not exit the
SIGCHLD handler as long as any child is running.  With normal scripts this
does not cause any problem, but it is quite problematic when there are
background jobs.  But perhaps we cannot do anything better on these
systems.  Perhaps there are no such systems out there.

The other big problem is that shell traps are called from the signal
handler, which means that a signal handler can execute anything.  This of
course violates every standard, and it is very dangerous since it assumes
that every function used by the shell, either in the shell's code itself or
in the system's libc, is reentrant.  A better implementation would be to
keep a count for each signal, which is incremented each time that signal
is received, and to act on these counts in the normal execution
flow.  POSIX says that when a signal is received while the shell is
waiting for the execution of a foreground command, the trap for that signal
is not executed until the command has terminated.  If more than one signal
is received during that time, the order of execution of the traps is
unspecified.  Note that, as I mentioned above, zsh currently completely
ignores signals received while waiting for a foreground job.
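
A minimal sketch of the per-signal counting scheme described above, not
code from the zsh sources.  All names here (pending_count, track_signal,
take_pending, MAX_TRACKED_SIG) are made up for illustration.  The handler
does nothing but increment a counter; the counts are consumed later from
the normal execution flow, where it is safe to run a trap.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

#define MAX_TRACKED_SIG 64      /* assumption: covers the signals we trap */

/* One counter per signal; the handler only ever increments these. */
static volatile sig_atomic_t pending_count[MAX_TRACKED_SIG];

/* Async-signal-safe handler: record the signal and return. */
static void count_signal(int sig)
{
    if (sig > 0 && sig < MAX_TRACKED_SIG)
        pending_count[sig]++;
}

/* Install the counting handler for sig.  Returns 0 on success, -1 on error. */
int track_signal(int sig)
{
    struct sigaction sa;

    assert(sig > 0 && sig < MAX_TRACKED_SIG);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(sig, &sa, NULL);
}

/*
 * Called from the normal execution flow.  Returns how many times sig
 * arrived since the last call and resets the counter.  The signal is
 * blocked only for the short read-and-reset, so no arrival is lost.
 */
int take_pending(int sig)
{
    sigset_t block, old;
    int n;

    assert(sig > 0 && sig < MAX_TRACKED_SIG);
    sigemptyset(&block);
    sigaddset(&block, sig);
    sigprocmask(SIG_BLOCK, &block, &old);
    n = pending_count[sig];
    pending_count[sig] = 0;
    sigprocmask(SIG_SETMASK, &old, NULL);
    return n;
}
```

A shell main loop could call take_pending() for each trapped signal once a
foreground command has finished and run the corresponding trap if the
result is nonzero.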

We also have to find a way to handle child reaping in the SIGCHLD handler.
The current handler uses and modifies the job table; that's why we need to
block child signals quite often and do complicated pipe synchronizations.
The handler could instead place the child statistics in a queue independent
of the job table, which can be processed in the normal execution flow.
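
One way such a queue could look, as a rough sketch only.  The names
(child_event, child_queue, install_child_handler, next_child_event) and the
fixed capacity are assumptions for illustration, and it uses waitpid()
where a real shell would probably want wait3() to collect resource usage
as well.  The handler only reaps and records; the job table would be
updated by whoever drains the queue.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_QUEUE_SIZE 64     /* hypothetical capacity */

struct child_event {
    pid_t pid;
    int status;                 /* raw status as returned by waitpid() */
};

/* Ring buffer: the handler advances head, the main flow advances tail. */
static volatile struct child_event child_queue[CHILD_QUEUE_SIZE];
static volatile sig_atomic_t queue_head;
static volatile sig_atomic_t queue_tail;

/* SIGCHLD handler: reap without blocking, never touch the job table. */
static void reap_children(int sig)
{
    int saved_errno = errno;    /* waitpid() may clobber errno */

    (void)sig;
    for (;;) {
        int next = (queue_head + 1) % CHILD_QUEUE_SIZE;
        int status;
        pid_t pid;

        if (next == queue_tail)
            break;              /* queue full: leave the rest unreaped */
        pid = waitpid(-1, &status, WNOHANG);
        if (pid <= 0)
            break;              /* nothing more to reap right now */
        child_queue[queue_head].pid = pid;
        child_queue[queue_head].status = status;
        queue_head = next;
    }
    errno = saved_errno;
}

/* Install the reaping handler.  Returns 0 on success, -1 on error. */
int install_child_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = reap_children;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Main flow: copy the oldest event to *out.  Returns 1, or 0 if empty. */
int next_child_event(struct child_event *out)
{
    assert(out != NULL);
    if (queue_tail == queue_head)
        return 0;
    out->pid = child_queue[queue_tail].pid;
    out->status = child_queue[queue_tail].status;
    queue_tail = (queue_tail + 1) % CHILD_QUEUE_SIZE;
    return 1;
}
```

If the queue ever fills up, some children stay unreaped and no further
SIGCHLD may arrive for them, so the main flow would have to reap again
after draining the queue.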

One minor bug: zsh always resets the signal mask on startup, whereas POSIX
says that the signal mask inherited from the parent should be passed down
to child processes, except that SIGINT and SIGQUIT are always blocked for
asynchronous processes.

The pending signals can be checked and processed after each foreground
pipeline terminates.  Here we have to handle untrapped INT signals by
quitting from any loops and shell functions.

Time-consuming builtins can occasionally check the SIGINT count and
terminate safely on interrupt.  A lot of code can be simplified, as we no
longer have to worry about static and global variables changing
unexpectedly.  execsave()/execrestore() can be removed.
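
To illustrate the polling idea, here is a small sketch with a made-up
builtin.  The names slow_builtin, install_sigint_counter and POLL_INTERVAL
are hypothetical, and the summing loop merely stands in for real work.
The point is that the loop stops only at a place it chooses itself, so its
data is never left half-updated.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

#define POLL_INTERVAL 4         /* hypothetical: check every 4 items */

/* Incremented by the handler, reset by the shell's main loop. */
static volatile sig_atomic_t sigint_count;

static void note_sigint(int sig)
{
    (void)sig;
    sigint_count++;
}

/* Install the counting SIGINT handler.  Returns 0 on success, -1 on error. */
int install_sigint_counter(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = note_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGINT, &sa, NULL);
}

/*
 * Stand-in for a time-consuming builtin: sums n items into *sum.
 * Returns the number of items processed; fewer than n means the loop
 * stopped early because a SIGINT had been counted.
 */
size_t slow_builtin(const int *items, size_t n, long *sum)
{
    size_t i;

    assert(items != NULL && sum != NULL);
    *sum = 0;
    for (i = 0; i < n; i++) {
        /* Poll at a point where all data is in a consistent state. */
        if (i % POLL_INTERVAL == 0 && sigint_count > 0)
            break;
        *sum += items[i];
    }
    return i;
}
```

The builtin does not reset the count itself; that is left to the main
loop, which also has to unwind any enclosing loops and shell functions.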

Of course this means that large parts of jobs.c/signals.c/exec.c have to be
rewritten.  Unfortunately this code is quite hard to understand, and
probably no one knows exactly how it works in zsh.  There may be many
compatibility problems here.  I think it would be quite important to fix
this area, since this is really the most buggy part of zsh.  Many
developers may think that zle_tricky.c/lex.c/subst.c/parse.c are more buggy
and harder to understand.  I do not agree, but that's perhaps because I
worked a lot on these and I may understand them better than others.  But in
exec.c/jobs.c/signals.c there are pieces of code which I do not understand
at all (but I did not try very hard, so it is probably just a question of
time).  I'd be glad if someone would volunteer to start this cleanup.  I
won't have much time for hacking in February, so I will not start it now.

Zoltan


