Zsh Mailing List Archive

Multio deadlock (Re: multios doesn't work with 2>&1)



On Oct 27, 11:27am, Bart Schaefer wrote:
}
} } echo foo >/dev/null 2>&1 | sed 's/foo/bar/'
} } 
} } gives a different bad effect, namely you get the output you want but the
} } shell hangs
} 
} The parent shell is in zwaitjob(), as is the shell that spawned sed.

That should actually say "as is the shell that was forked for echo".
The parent shell is waiting for "sed".

Here's the spot where the "echo" shell is stopped:

	/*
	 * So what's going on here then?  Well, I'm glad you asked.
	 *
	 * If we create multios for use in a subshell we do
	 * this after forking, in this function above.  That
	 * means that the current (sub)process is responsible
	 * for clearing them up.  However, the processes won't
	 * go away until we have closed the fd's talking to them.
	 * Since we're about to exit the shell there's nothing
	 * to stop us closing all fd's (including the ones 0 to 9
	 * that we usually leave alone).
	 *
	 * Then we wait for any processes.  When we forked,
	 * we cleared the jobtable and started a new job just for
	 * any oddments like this, so if there aren't any we won't
	 * need to wait.  The result of not waiting is that
	 * the multios haven't flushed the fd's properly, leading
	 * to obscure missing data.
	 *
	 * It would probably be cleaner to ensure that the
	 * parent shell handled multios, but that requires
	 * some architectural changes which are likely to be
	 * hairy.
	 */
	for (i = 0; i < 10; i++)
	    if (fdtable[i] != FDT_UNUSED)
		close(i);
	closem(FDT_UNUSED);
	if (thisjob != -1)
	    waitjobs();
	_exit(lastval);

Obviously we've not succeeded in closing all the necessary descriptors.
Here's what's still open:

zsh     16361 16361 schaefer    0u   CHR  136,3            5 /dev/pts/3
zsh     16361 16361 schaefer    1u   CHR  136,3            5 /dev/pts/3
zsh     16361 16361 schaefer    2u   CHR  136,3            5 /dev/pts/3
zsh     16361 16361 schaefer   10u   CHR  136,3            5 /dev/pts/3
zsh     16383 16383 schaefer    2w  FIFO    0,7      1227018 pipe
zsh     16384 16383 schaefer   12w  FIFO    0,7      1227016 pipe
zsh     16384 16383 schaefer   13w   CHR    1,3         2056 /dev/null
zsh     16384 16383 schaefer   14r  FIFO    0,7      1227018 pipe
sed     16385 16383 schaefer    0r  FIFO    0,7      1227016 pipe
sed     16385 16383 schaefer    1u   CHR  136,3            5 /dev/pts/3
sed     16385 16383 schaefer    2u   CHR  136,3            5 /dev/pts/3

16361 is the parent; it's clean.  16383 is echo and 16384 is the multio.
The multio is blocked reading fd 14 (pipe 1227018), which its parent
(16383) still has open for writing as stderr: fdtable[2] == FDT_UNUSED,
so the loop above never calls close(2).
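
The blocking itself is ordinary pipe behaviour: a reader only sees
end-of-file once every descriptor referring to the write end has been
closed.  A minimal standalone sketch of that, not zsh code (the function
names are made up for illustration, and the read end is made non-blocking
only so that the demo cannot hang the way the shell does):

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of zsh.
 * Try to read one byte from a non-blocking descriptor.
 * Returns 1 if the read would block (a writer still exists),
 *         0 at end-of-file (no writers are left),
 *        -1 for anything else.
 */
static int read_state(int rfd)
{
    char c;
    ssize_t n = read(rfd, &c, 1);

    if (n == 0)
        return 0;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 1;
    return -1;
}

/*
 * Hypothetical demo: one pipe whose write end is held through two
 * descriptors.  "leaked" stands in for the fd 2 that is never closed.
 * Returns 1 if EOF arrives only after the last write descriptor is
 * closed, 0 if not, -1 on setup failure.
 */
int demo_leaked_writer(void)
{
    int fds[2];

    if (pipe(fds) < 0)
        return -1;

    int leaked = dup(fds[1]);   /* second reference to the write end */
    if (leaked < 0 || fcntl(fds[0], F_SETFL, O_NONBLOCK) < 0) {
        close(fds[0]);
        close(fds[1]);
        if (leaked >= 0)
            close(leaked);
        return -1;
    }

    close(fds[1]);                      /* the copy we remembered */
    int before = read_state(fds[0]);    /* writer still open: would block */
    close(leaked);                      /* the copy we forgot */
    int after = read_state(fds[0]);     /* no writers left: EOF */
    close(fds[0]);

    return before == 1 && after == 0;
}
```

demo_leaked_writer() should return 1.  In the hang above, fd 2 of 16383
plays the part of "leaked", and the multio's read on fd 14 is a blocking
one, so it simply never returns.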

Does the following look right?  It does fix the deadlock, but we might
call close() on an already closed fd, which the existing test appears
to be trying to avoid (maybe so as not to change errno?).

diff --git a/Src/exec.c b/Src/exec.c
index 99c7eaa..7ac1ad5 100644
--- a/Src/exec.c
+++ b/Src/exec.c
@@ -3372,7 +3372,7 @@ execcmd(Estate state, int input, int output, int how, int last1)
 	 * hairy.
 	 */
 	for (i = 0; i < 10; i++)
-	    if (fdtable[i] != FDT_UNUSED)
+	    if (i < 3 || fdtable[i] != FDT_UNUSED)
 		close(i);
 	closem(FDT_UNUSED);
 	if (thisjob != -1)
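

On the errno question: close() on a descriptor that is already closed
fails with EBADF and is otherwise harmless.  If preserving errno were
the concern, one option is to save and restore it around the loop.  A
standalone sketch under that assumption, not zsh code (both function
names are hypothetical):

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of zsh.
 * Close descriptors from..to (inclusive) unconditionally, leaving
 * errno exactly as the caller had it.
 */
void close_fds_keep_errno(int from, int to)
{
    int saved = errno;

    for (int fd = from; fd <= to; fd++)
        (void)close(fd);    /* EBADF for already-closed fds is ignored */
    errno = saved;
}

/*
 * Hypothetical check: returns 1 if closing the same descriptor a
 * second time fails with EBADF, 0 if not, -1 on setup failure.
 */
int double_close_sets_ebadf(void)
{
    int fds[2];

    if (pipe(fds) < 0)
        return -1;
    close(fds[0]);
    close(fds[1]);

    errno = 0;
    return close(fds[1]) == -1 && errno == EBADF;
}
```

With such a helper the patched loop could be written as
close_fds_keep_errno(0, 2) followed by the original fdtable test for
fds 3 to 9; whether that is worth doing right before _exit() is a
separate question.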


