Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: zsh mysteriously suspending job with sudo



On Sun, 2020-02-23 at 14:03 +0000, Stephane Chazelas wrote:
> 2020-02-22 18:09:13 -0700, Ronan Pigott:
> Can be reproduced with:
> 
> $ sleep 1 | sudo sh -c 'sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" /proc/self/stat /dev/tty'
> UID        PID  PPID  PGID   SID  C STIME TTY          TIME CMD
> chazelas 25430  8308 25430 25430  0 13:44 pts/1    00:00:00 /bin/zsh
> root     26867 25430 26866 25430  0 13:46 pts/1    00:00:00   sudo sh -c sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" /proc/self/stat /dev/tty
> root     26868 26867 26866 25430  0 13:46 pts/1    00:00:00     sh -c sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" /proc/self/stat /dev/tty
> root     26871 26868 26866 25430  0 13:46 pts/1    00:00:00       ps -jfHt /dev/pts/1
> 25430
> zsh: done                   sleep 1 |
> zsh: suspended (tty input)  sudo sh -c
> 
> As seen above, at the time "ps" is run (and awk later), the
> foreground process group of the terminal is 25430 which is the
> pgid of the main shell, not the pgid of the foreground job
> (26866), which is why that job gets a SIGTTIN when awk tries to
> read from the terminal (or SIGTTOU when pacman does an ioctl to
> the terminal).
> 
> From "strace", it seems it's because when "sleep 1" (the process
> group leader) finishes, zsh does a kill(-26866,0), presumably to
> check that the process group is still alive, but that fails with
> EPERM as there are processes running as root in that group, and
> then zsh changes the foreground process group back to the main
> shell's.
> 
> So it seems indeed to be a bug in zsh.
> 
> I suppose an easy fix would be to check for an errno of ESRCH
> when kill(-pgid,0) fails to make sure it's because the process
> group is gone.

Thanks, this is probably something like the following --- it hits a
number of other places doing something similar which look like they need
the same treatment, though I'm happy to be told otherwise, but the one
in signals.c is the key one here.

I don't know if we really need the different cases of killpg(pgrp, 0)
and kill(-pgprp, 0)?

> But, here the shell should be able to know that
> the job is not gone as sudo, a direct children of the shell has
> not returned yet, so there's probably something wrong with the
> logic in the first place.

I wouldn't have the first clue how to begin checking that there is some
appropriate set of processes associated with the process group still
there that would allow this in a sufficiently race-free manner, but if
anyone does have a clue they are welcome to try.

pws

diff --git a/Src/exec.c b/Src/exec.c
index 50027654a..0d321b757 100644
--- a/Src/exec.c
+++ b/Src/exec.c
@@ -1036,7 +1036,8 @@ entersubsh(int flags, struct entersubsh_ret *retp)
     } else if (thisjob != -1 && (flags & ESUB_PGRP)) {
 	if (jobtab[list_pipe_job].gleader && (list_pipe || list_pipe_child)) {
 	    if (setpgrp(0L, jobtab[list_pipe_job].gleader) == -1 ||
-		killpg(jobtab[list_pipe_job].gleader, 0) == -1) {
+		(killpg(jobtab[list_pipe_job].gleader, 0) == -1  &&
+		errno == ESRCH)) {
 		jobtab[list_pipe_job].gleader =
 		    jobtab[thisjob].gleader = (list_pipe_child ? mypgrp : getpid());
 		setpgrp(0L, jobtab[list_pipe_job].gleader);
diff --git a/Src/jobs.c b/Src/jobs.c
index e7438251e..182e65f39 100644
--- a/Src/jobs.c
+++ b/Src/jobs.c
@@ -283,7 +283,8 @@ handle_sub(int job, int fg)
 
 	    if ((cp = ((WIFEXITED(jn->procs->status) ||
 			WIFSIGNALED(jn->procs->status)) &&
-		       killpg(jn->gleader, 0) == -1))) {
+		       (killpg(jn->gleader, 0) == -1 &&
+			errno == ESRCH)))) {
 		Process p;
 		for (p = jn->procs; p->next; p = p->next);
 		jn->gleader = p->pid;
@@ -541,9 +542,13 @@ update_job(Job jn)
 
 	/* is this job in the foreground of an interactive shell? */
 	if (mypgrp != pgrp && inforeground &&
-	    (jn->gleader == pgrp || (pgrp > 1 && kill(-pgrp, 0) == -1))) {
+	    (jn->gleader == pgrp ||
+	     (pgrp > 1 &&
+	      (kill(-pgrp, 0) == -1 && errno == ESRCH)))) {
 	    if (list_pipe) {
-		if (somestopped || (pgrp > 1 && kill(-pgrp, 0) == -1)) {
+		if (somestopped || (pgrp > 1 &&
+				    kill(-pgrp, 0) == -1 &&
+				    errno == ESRCH)) {
 		    attachtty(mypgrp);
 		    /* check window size and adjust if necessary */
 		    adjustwinsize(0);
@@ -2469,7 +2474,8 @@ bin_fg(char *name, char **argv, Options ops, int func)
 		    if ((jobtab[job].stat & STAT_SUPERJOB) &&
 			((!jobtab[job].procs->next ||
 			  (jobtab[job].stat & STAT_SUBLEADER) ||
-			  killpg(jobtab[job].gleader, 0) == -1)) &&
+			  (killpg(jobtab[job].gleader, 0) == -1  &&
+			  errno == ESRCH))) &&
 			jobtab[jobtab[job].other].gleader)
 			attachtty(jobtab[jobtab[job].other].gleader);
 		    else
diff --git a/Src/signals.c b/Src/signals.c
index 96ff9e9b3..4adf03202 100644
--- a/Src/signals.c
+++ b/Src/signals.c
@@ -539,7 +539,8 @@ wait_for_processes(void)
 #endif
 		if (WIFEXITED(status) &&
 		    pn->pid == jn->gleader &&
-		    killpg(pn->pid, 0) == -1) {
+		    killpg(pn->pid, 0) == -1  &&
+		    errno == ESRCH) {
 		    if (last_attached_pgrp == jn->gleader &&
 			!(jn->stat & STAT_NOSTTY)) {
 			/*



Messages sorted by: Reverse Date, Date, Thread, Author