Zsh Mailing List Archive

Re: Another bug when suspending pipelines



On Sep 16,  1:33pm, Peter Stephenson wrote:
} Subject: Another bug when suspending pipelines
}
} To check my code for the other pipeline suspending problem, I came up
} with a command designed to check different cases, in particular where
} the left of the pipeline was still running:
} 
}   (sleep 5; print foo) | { sleep 5; read bar; print $bar; }
} 
} Suspending this sometimes doesn't work: the ^Z is delayed and gets
} relayed to the parent shell.

In this construct, the "read" is going to be running in the current
zsh; that is:

% (sleep 5; print foo) | { sleep 5; read bar; print GOT: $bar; }

should print "GOT: foo" even after ^Z/fg, whereas

% (sleep 5; print foo) | { sleep 5; read bar }; print GOT: $bar

should print "GOT: foo" if it runs uninterrupted, but should print
nothing if it is suspended with ^Z and resumed with fg.

Therefore the current (parent) shell is required to handle the TSTP in
this case.

} I have no handle on what aspect of this is problematic, but I don't
} see anything about the shell code above that suggests this is a
} particularly hairy case.

Let's tweak this slightly to be

  (sleep 6; print foo) | { sleep 7; read bar; print $bar }

just for clarity in the following description.

I believe what should be happening is:

* The "sleep 7" is initially the tty group leader, because it is the
  foreground job in the parent shell.  "sleep 6" and its subshell
  parent will not be in this process group because they were started
  before "sleep 7" existed; instead they are in the process group of
  the original parent.
* If the keyboard signal is generated during "sleep 7", sleep gets a
  SIGTSTP, is stopped, and the parent shell is notified by SIGCHLD.
  The parent reads the wait() status and finds that the foreground job
  has stopped.
* The parent responds by stopping all the other jobs in the current
  pipeline.  It forks the brace expression into a new process and
  stops that one as well (or maybe that one stops itself, I don't
  recall the exact sequence of events here).
* The parent reclaims the tty leadership and returns to the prompt.

You can see this happen by adding "print $sysparams[pid]" (sysparams
comes from the zsh/system module) on both sides of the "sleep 7"; you
will get a different PID before and after the ^Z/fg.

So now we have two different process groups:  The "sleep 7" all by
itself, and the parent shell, whose group includes two more jobs,
the subshell and the rest of the brace expression.

The parent has to manage all three of these jobs, because it can't
hand off the already-forked "sleep 7" to the newly-forked brace job.
It has to arrange that the brace job can be sent a SIGCONT on "fg"
but not actually do anything until the "sleep 7" has finished; I
believe that's handled by the "synch" pipes and hidden reads; in any
case, it works.

On "fg" the parent:
* makes "sleep 7" the tty group leader again
* sends SIGCONT everywhere
* and waits for "sleep 7" to exit.
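
The resume-and-wait part of that sequence can be sketched in plain shell.
This is only an illustration under stated assumptions, not what zsh does
internally: the function name resume_and_wait is hypothetical, SIGSTOP
stands in for the SIGTSTP the terminal would send, a one-second sleep
stands in for "sleep 7", and the tty hand-off is left out because a script
cannot do that step.

```shell
# Hypothetical helper: stop a child, resume it, and wait for it to exit.
resume_and_wait() {
    sleep 1 &                 # stand-in for the already-forked "sleep 7"
    head_pid=$!

    # Stand-in for ^Z. SIGSTOP is used because it always stops the
    # process; the terminal would really deliver SIGTSTP.
    kill -STOP "$head_pid"

    # ... the parent would be back at its prompt here ...

    kill -CONT "$head_pid"    # "fg": resume the stopped job
    wait "$head_pid"          # block until it exits
    echo "sleep exited with status $?"
}

resume_and_wait
```

The point of the sketch is only the ordering: the continue signal goes out
first, and the parent then blocks in wait until the resumed command is done.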

If I'm correct about the synch pipe, the tail of the brace expression
is blocked waiting to read that.  When "sleep 7" finally exits, the
parent:

* makes the tail of the brace expression into a process group which
  becomes the tty leader, and
* writes a byte on the synch pipe to wake it up.

And now we're finally back to the same state we'd have started in if
the brace expression had instead been a subshell, and the parent may
proceed with simply waiting for children.
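
The synch-pipe idea described above can be sketched in plain shell. The
assumptions are loud ones: a named FIFO in a scratch directory stands in
for the shell's internal pipe, the function name synch_demo and the file
name "synch" are hypothetical, a one-second sleep stands in for "sleep 7",
and a whole line is written rather than a single byte because the shell's
read is line-oriented.

```shell
# Hypothetical sketch: a forked "tail" blocks on a pipe until the parent
# writes to it.
synch_demo() {
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/synch" || return 1

    # Stand-in for the forked tail of the brace expression: it blocks in
    # read until something arrives on the FIFO.
    ( read -r go < "$dir/synch"; echo "tail: released" ) &
    tail_pid=$!

    sleep 1                          # stand-in for waiting on "sleep 7"
    echo "parent: sleep finished"
    echo go > "$dir/synch"           # wake the tail
    wait "$tail_pid"
    rm -rf "$dir"
}

synch_demo
```

The tail cannot print until the parent has written to the FIFO, so the
"parent" line always comes out first, however the two processes are
scheduled.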

"Particularly hairy"?

When the forked brace expression stops, the parent is going to be
notified about that again, possibly by SIGCHLD.  If it mishandles
that, it could be the source of the delayed ^Z you observed, and
could consequently cause an incorrect second pass through "stopping
all other jobs in the current pipeline".  But I can't reproduce
the double-stop to confirm that.


