Zsh Mailing List Archive

Re: Subshell with multios causes hang

On Tue, 22 May 2007 12:21:43 +0100
John Buddery <jvb@xxxxxxxxxxxxxxxx> wrote:
> Hi, since upgrading from 4.2.5 to 4.2.6 I find that one of my
> functions which uses a multios redirect on a subshell list is
> hanging. I tried 4.3.4 as well with no luck.
> Essentially I run the equivalent of:
>    ( echo hello ) >| /tmp/out >| /tmp/out2
> and in an interactive shell (or any with job control) this hangs.
> All of the following fixes solve this problem, but I don't know what
> else they break:
>     Setting thisjob = -1 in clearjobtab(), since there is no current
> job, and making addproc() ignore the addition of aux processes if
> thisjob == -1. This also seems wrong, as we are completely losing the
> pid information for the multios, so for example we can't kill it.
>     Setting thisjob = 1 in clearjobtab (if it was >= 0), and setting
> jobtab[thisjob].stat = STAT_INUSE after clearing jobtab. This is what
> I ended up with, but is it a valid thing to do?

Thanks for the detailed analysis, which will have saved me hours.

There's clearly something of a design flaw here: we're using (an effect
of) job control when no job control is present.  However, the shell
does use the so-called job table for this purpose (managing processes
even if they're not strictly associated with a job), so we have to live
with it.

In that spirit what I'd *like* to suggest is something close to what you
came up with: set thisjob to -1 in clearjobtab() (it's sure as heck
invalid), and then when we need a job table entry in closemn(), detect
that thisjob is -1 and initialise a new job.

Problem 1: this happens before execpline() runs in the subshell, which
grabs a different job table entry.  The one generated by closemn() is
forgotten.  We can fix this by setting a temporary job number saying
"use me! use me!".  This isn't very nice but doesn't involve
redesigning the shell from scratch.
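
To make the idea concrete, here is a toy model in Python rather than
the shell's C.  It is only a sketch of the bookkeeping described above
and not zsh code: ToyJobTable, add_aux(), start_pipeline() and the
"pending" field are all made-up names, standing in loosely for the job
table, the point in closemn() where an entry is needed, the point where
execpline() picks its entry, and the "use me!" marker.

```python
from dataclasses import dataclass, field

NO_JOB = -1  # sentinel meaning "there is no current job"


@dataclass
class Job:
    in_use: bool = False
    aux_pids: list = field(default_factory=list)


class ToyJobTable:
    """Hypothetical model of the bookkeeping; none of this is zsh source."""

    def __init__(self, size=4):
        self.jobs = [Job() for _ in range(size)]
        self.thisjob = NO_JOB
        self.pending = NO_JOB  # the temporary "use me!" job number

    def clear(self):
        # After clearing (as in a freshly forked subshell) no job is current.
        self.jobs = [Job() for _ in self.jobs]
        self.thisjob = NO_JOB
        self.pending = NO_JOB

    def _alloc(self):
        for i, job in enumerate(self.jobs):
            if not job.in_use:
                job.in_use = True
                return i
        raise RuntimeError("job table full")

    def add_aux(self, pid):
        # An auxiliary process needs an entry: create one lazily if there
        # is no current job, and remember it for the pipeline that follows.
        if self.thisjob == NO_JOB:
            self.thisjob = self._alloc()
            self.pending = self.thisjob
        self.jobs[self.thisjob].aux_pids.append(pid)
        return self.thisjob

    def start_pipeline(self):
        # Reuse the remembered entry instead of grabbing a different one,
        # so the auxiliary process is not forgotten.
        if self.pending != NO_JOB:
            job, self.pending = self.pending, NO_JOB
        else:
            job = self._alloc()
        self.thisjob = job
        return job
```

In this model, calling clear(), then add_aux(), then start_pipeline()
leaves the auxiliary pid attached to the same entry the pipeline uses,
which is the property Problem 1 is about.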

Problem 2: this is where it gets really nasty to the extent that I'm
worried I must be missing something basic about multios.  We now do the
"echo" in the subshell, and on return to execpline() wait for the
auxiliary process handling the multios to exit.  But it's never going
to!  It's waiting for end-of-file on the data it's reading from the
subshell that's waiting for it.  Because we attached the multios
process after the fork, we have deadlock.
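
The underlying pipe behaviour can be shown in isolation.  The sketch
below is a Python illustration of the general POSIX rule, not of the
shell's own code: a reader only sees end-of-file once every write end
of the pipe has been closed.  The function name reader_sees_eof is
made up for this example, and select() with a timeout is used so that
the "stuck" case returns instead of hanging.

```python
import os
import select


def reader_sees_eof(close_writer_first, timeout=0.2):
    """Return True if the read end reports EOF within `timeout` seconds."""
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"hello\n")
    os.read(read_fd, 100)  # drain the data so that only EOF can follow

    if close_writer_first:
        os.close(write_fd)  # writer is done: the reader can now get EOF

    # With a write end still open and no data pending, the read end never
    # becomes ready; a plain blocking read() here would wait forever.
    ready, _, _ = select.select([read_fd], [], [], timeout)
    eof = bool(ready) and os.read(read_fd, 100) == b""

    os.close(read_fd)
    if not close_writer_first:
        os.close(write_fd)
    return eof
```

The case where the write end stays open corresponds to the situation
above: as long as something the reader depends on still holds the pipe
open for writing, the reader cannot finish.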

Wossgoingon?  How do multios ever work?  Is there some call to close the
shell fd's (giving the EOF the aux proc is waiting for) that hasn't
quite been handled at that point, but usually has?

Possible clue: last1 is 1 in this version of execpline(), indicating
we're about to leave the shell.  The auxprocs are the only reason we
can't.  So there must be some solution...

I'll carry on looking at this when I get a chance, but for now I'm
confused enough to go to the beer festival.

Peter Stephenson <pws@xxxxxxx>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070
