Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: Deadlock when receiving kill-signal from child process



On Aug 6, 11:24am, Mathias Fredriksson wrote:
} Subject: Re: Deadlock when receiving kill-signal from child process
}
} On Thu, Aug 6, 2015 at 8:06 AM, Bart Schaefer wrote:
} }
} } I played around with this a bit by hacking loop() but the effect is
} } that with the test script Mathias provided, most of the USR1 signals
} } are just thrown away (they collapse into a single call to the trap
} } handler).  Not sure if that's actually the desired effect.
} 
} I would imagine some might rely on every signal being handled, e.g.
} keeping a count.

Even without spending most of the time in queuing, *some* of the signals
get dropped at the OS level.  The only way I can get them all to be
tallied is to remove the "sleep" from the trap function.
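
The OS-level dropping can be seen in isolation with a small C sketch.
It assumes a single-threaded process that does not already have SIGUSR1
blocked, and the function name count_coalesced_deliveries() is made up
for this illustration. SIGUSR1 is blocked by hand here; while a trap
handler is running the kernel blocks the signal in much the same way,
so a slow handler (the "sleep") widens the window in which repeats are
merged. POSIX leaves it implementation-defined whether extra instances
of a pending non-realtime signal are kept; Linux keeps only one.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

/* Incremented once per actual handler invocation. */
static volatile sig_atomic_t usr1_deliveries;

static void on_usr1(int sig)
{
    (void)sig;
    usr1_deliveries++;
}

/*
 * Send SIGUSR1 to this process `sends` times while it is blocked,
 * then unblock it.  Returns the number of times the handler ran,
 * or -1 if the handler or mask could not be set up.
 */
int count_coalesced_deliveries(int sends)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask;

    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, &old_sa) < 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old_mask) < 0)
        return -1;

    usr1_deliveries = 0;
    for (int i = 0; i < sends; i++)
        kill(getpid(), SIGUSR1);    /* only marks the signal pending */

    /* Restoring the mask lets the pending signal through. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGUSR1, &old_sa, NULL);
    return (int)usr1_deliveries;
}
```

Calling count_coalesced_deliveries(5) from main() and printing the
result shows how many of the five signals survived, which is why a
trap that counts USR1 cannot be relied on for an exact tally.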

} The following traces have the last patches applied (I did multiple
} runs to see if I could hit different states):
} 
} #15 0x000000010df38e63 in runshfunc ()
} #16 0x000000010df38936 in doshfunc ()

This confirms my suspicion about doshfunc().  Sadly it's called all
over the place, sometimes with signals explicitly un-queued and other
times with no change to the surrounding context.

} #0  0x00007fff8abfe166 in __psynch_mutexwait ()
} #1  0x00007fff8e4b578a in _pthread_mutex_lock ()
} #2  0x00007fff82ce5750 in fputc ()
} #9  <signal handler called>
} #22 <signal handler called>
} #32 <signal handler called>
} #34 0x00007fff8e4b5714 in _pthread_mutex_lock ()
} #35 0x00007fff82ce43a3 in ferror ()

This is the stdio thing again.  Anyone reading this familiar enough with
the POSIX or C standards to point to whether stdio is required to be
signal-safe with pthreads?  I.e., is this our bug or someone else's?

(Not that zsh is using threads, but stdio is using pthread mutexes.)
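
As a point of reference for the question above: POSIX defines a list of
async-signal-safe functions, and stdio calls such as fputc() and
ferror() are not on it, while write() is. The usual way to stay inside
that list is sketched below. This is a generic pattern, not what zsh
does; the names safe_handler(), install_safe_handler() and
report_pending() are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

/* Set by the handler, consumed by the main flow. */
static volatile sig_atomic_t pending_usr1;

static void safe_handler(int sig)
{
    static const char msg[] = "got SIGUSR1\n";
    ssize_t n;

    (void)sig;
    pending_usr1 = 1;
    /* write() takes no stdio lock, so it cannot deadlock against
     * an fputc() that this signal interrupted. */
    n = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)n;
}

/* Install the handler for SIGUSR1; returns 0 on success, -1 on error. */
int install_safe_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = safe_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGUSR1, &sa, NULL);
}

/*
 * Called from the main flow, outside any handler, where stdio is
 * safe to use.  Returns 1 if a signal had been recorded, else 0.
 */
int report_pending(FILE *out)
{
    if (!pending_usr1)
        return 0;
    pending_usr1 = 0;
    fputs("handled SIGUSR1 outside the handler\n", out);
    return 1;
}
```

A shell cannot restrict a user-written trap to write(), which is why
deferring the trap until stdio is not mid-call (signal queueing) is the
relevant tool here.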

} setopt NO_ASYNC_TRAPS:

NO_TRAPS_ASYNC ?

Anyway, same two issues as above, just slightly different paths (no
multiple signal handlers in the stdio case, but one is enough).

As with the previous dotrapargs() patch, I'm a little nervous about
the dont_queue_signals() bits, but that's the only safe way to do
the disabling part of signal queueing when the enabling part is not
in local scope.
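
To illustrate why a plain unqueue is not enough when the matching
queue call is somewhere up the stack, here is a toy model of a nested
queue level. It is only a sketch under assumed semantics: every toy_*
name is invented, signals are reduced to counters, and nothing here is
copied from the zsh sources. The point is that a decrement only
reaches zero if the caller knows the depth, whereas "save the level,
force it to zero, put it back" works at any depth.

```c
/* Toy model: all names are hypothetical, not zsh's implementation. */
static int toy_queue_level;   /* > 0 means signals are deferred */
static int toy_pending;       /* signals deferred so far */
static int toy_handled;       /* signals whose handler has run */

static void toy_run_pending(void)
{
    while (toy_pending > 0) {
        toy_pending--;
        toy_handled++;
    }
}

/* A signal arrives: defer it if queueing is on, else handle it now. */
void toy_signal(void)
{
    if (toy_queue_level > 0)
        toy_pending++;
    else
        toy_handled++;
}

void toy_queue(void) { toy_queue_level++; }

/* Leave one level; deferred signals run only at the outermost level. */
void toy_unqueue(void)
{
    if (toy_queue_level > 0 && --toy_queue_level == 0)
        toy_run_pending();
}

int toy_level(void) { return toy_queue_level; }

/* Turn queueing off regardless of depth and flush what was deferred. */
void toy_dont_queue(void)
{
    toy_queue_level = 0;
    toy_run_pending();
}

/* Put back a level previously read with toy_level(). */
void toy_restore(int level) { toy_queue_level = level; }

int toy_handled_count(void) { return toy_handled; }
```

With two nested toy_queue() calls in effect, a single toy_unqueue()
still leaves signals deferred, while toy_dont_queue() followed later by
toy_restore(q) opens a window for handlers and then returns to the
caller's depth, which is the shape of the runshfunc() changes below.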

diff --git a/Src/exec.c b/Src/exec.c
index 7612d43..2886785 100644
--- a/Src/exec.c
+++ b/Src/exec.c
@@ -4820,11 +4833,9 @@ execshfunc(Shfunc shf, LinkList args)
     if ((osfc = sfcontext) == SFC_NONE)
 	sfcontext = SFC_DIRECT;
     xtrerr = stderr;
-    unqueue_signals();
 
     doshfunc(shf, args, 0);
 
-    queue_signals();
     sfcontext = osfc;
     free(cmdstack);
     cmdstack = ocs;
@@ -5039,6 +5050,8 @@ doshfunc(Shfunc shfunc, LinkList doshargs, int noreturnval)
     static int funcdepth;
 #endif
 
+    queue_signals();	/* Lots of memory and global state changes coming */
+
     pushheap();
 
     oargv0 = NULL;
@@ -5261,6 +5274,8 @@ doshfunc(Shfunc shfunc, LinkList doshargs, int noreturnval)
     }
     popheap();
 
+    unqueue_signals();
+
     /*
      * Exit with a tidy up.
      * Only leave if we're at the end of the appropriate function ---
@@ -5296,7 +5311,7 @@ doshfunc(Shfunc shfunc, LinkList doshargs, int noreturnval)
 mod_export void
 runshfunc(Eprog prog, FuncWrap wrap, char *name)
 {
-    int cont, ouu;
+    int cont, ouu, q = queue_signal_level();
     char *ou;
 
     ou = zalloc(ouu = underscoreused);
@@ -5305,7 +5320,9 @@ runshfunc(Eprog prog, FuncWrap wrap, char *name)
 
     while (wrap) {
 	wrap->module->wrapper++;
+	dont_queue_signals();
 	cont = wrap->handler(prog, wrap->next, name);
+	restore_queue_signals(q);
 	wrap->module->wrapper--;
 
 	if (!wrap->module->wrapper &&
@@ -5320,7 +5337,9 @@ runshfunc(Eprog prog, FuncWrap wrap, char *name)
 	wrap = wrap->next;
     }
     startparamscope();
+    dont_queue_signals();
     execode(prog, 1, 0, "shfunc");
+    restore_queue_signals(q);
     if (ou) {
 	setunderscore(ou);
 	zfree(ou, ouu);


