Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: [PATCH] problem with 'ls | less' shell function



On Mon, Nov 7, 2022 at 12:44 AM Jun T <takimoto-j@xxxxxxxxxxxxxxxxx> wrote:
>
> [1] ( zsh -fc 'print a few words; /bin/sleep 10'; )

That one is missing the | { head -n -1 } part, but I think it's still
confirmed working?

> [2] { sleep 10; sleep 11; } | { sleep 20; sleep 21; }
> [3] dir () { ls | less; }; dir
>
> Now both ^C and ^Z/fg work for [1], ^C works for [2],
> and ^Z/fg works for [3] ([3] can't be killed by ^C but it's OK).
>
> But ^Z/fg still doesn't work for [2]; the pipeline never finishes.
[...]
> In the case of [2], the main zsh (zsh0) always fork()s a subshell (zsh1)
> for the left hand side of the pipe, and zsh1 fork/execs 'sleep 10'.
> On ^Z, zsh0 fork()s another subshell (zsh2) for the right hand
> side of the pipe, but it seems zsh2 is not causing the problem.

Hmm.  If I examine "pstree" I find:

zsh(43919)---zsh(43978)---sleep(43980)

43978 is your "zsh2" and is busy-waiting with 100% CPU at

#0  0x000055ca5bc4c042 in execpline (state=0x7ffc311ce770, slcode=4098, how=2,
    last1=0) at exec.c:1789
#1  0x000055ca5bc4abc3 in execlist (state=0x7ffc311ce770, dont_change_job=1,
    exiting=1) at exec.c:1444
#2  0x000055ca5bc477d7 in execcursh (state=0x7ffc311ce770, do_exec=1)
    at exec.c:450

43919 is "zsh1" and is here:

#0  0x00007fcadf04045c in __GI___sigsuspend (set=0x7ffc311cd9b0)
    at ../sysdeps/unix/sysv/linux/sigsuspend.c:26
#1  0x000055ca5bcbf050 in signal_suspend (sig=17, wait_cmd=0) at signals.c:393
#2  0x000055ca5bc7c31a in zwaitjob (job=1, wait_cmd=0) at jobs.c:1629

> After 'fg', zsh1 (not zsh2) is using 100% of a CPU core, and
> when 'sleep 10' exits it is left as "defunct" (not waited for).

pstree for that zombie sleep shows it's a child of 43919 (zsh1) and
hasn't been cleaned up yet because zsh1 is waiting on zsh2.

zsh2 believes it's going to get signaled for the sleep, but it won't.

> > 2022/10/22 6:22, Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> wrote:
> >
> > On Fri, Oct 21, 2022 at 12:41 AM Jun T <takimoto-j@xxxxxxxxxxxxxxxxx> wrote:
> >>
> >> Has the subshell missed the SIGCHLD?
> >
> > The subshell will never get the SIGCHLD, because it is a sibling of
> > the pipeline, not its parent.
>
> Do you mean the subshell for the right hand side of the pipe?

Yes, that is what I mean.  The zombie sleep and zsh2 are siblings.
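
To make the sibling point concrete, here is a small standalone sketch (not
zsh code; sibling_wait_errno is a hypothetical name). One parent forks two
children: the first exits at once, standing in for the sleep, and the second
stands in for zsh2 and tries to waitpid() for the first. Since a process can
only wait for its own children, the attempt should fail with ECHILD, and no
SIGCHLD is delivered to the sibling either.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns the errno seen by a child that calls waitpid() on its sibling,
 * 0 if that waitpid() unexpectedly succeeded, or -1 on setup failure.
 */
int sibling_wait_errno(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t left = fork();            /* stands in for 'sleep 10' */
    if (left == -1)
        return -1;
    if (left == 0)
        _exit(0);                   /* exits immediately */

    pid_t right = fork();           /* stands in for zsh2 */
    if (right == -1)
        return -1;
    if (right == 0) {
        int status;
        int err = 0;
        close(fds[0]);
        /* 'left' is our sibling, not our child, so this cannot succeed. */
        if (waitpid(left, &status, 0) == -1)
            err = errno;
        if (write(fds[1], &err, sizeof err) != (ssize_t)sizeof err)
            _exit(1);
        _exit(0);
    }

    /* Parent: collect the sibling's report, then reap both children. */
    close(fds[1]);
    int err = -1;
    if (read(fds[0], &err, sizeof err) != (ssize_t)sizeof err)
        err = -1;
    close(fds[0]);
    waitpid(left, NULL, 0);
    waitpid(right, NULL, 0);
    return err;
}
```

Only the common parent can reap the first child, which matches the zombie
staying around until zsh1 gets a chance to collect it.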

> Strace output suggests that zsh1 does not receive SIGCHLD when
> 'sleep 10' finishes. Maybe SIGCHLD is blocked?? (I'm not sure).

#2  0x000055ca5bc7c31a in zwaitjob (job=1, wait_cmd=0) at jobs.c:1629

job=1 is zsh2:

(gdb) p jobtab[1].procs[0]
$2 = {next = 0x55ca5dca7690, pid = 43978,

zwaitjob() in zsh1 will never return because zsh2 is in an infinite
loop waiting for a child that doesn't exist.

> zsh1 is repeatedly calling hasprocs(list_pid_job) at line 1790 in
> exec.c:

This is actually zsh2.  As you noted, hasprocs() returns zero, so we
fall through to

1902            else if (subsh && jn->stat & STAT_STOPPED)
1903                thisjob = newjob;
1904            else
1905                break;

The condition at 1902 is true because jn->stat refers to the zombie
sleep, which WAS stopped when zsh2 was forked (but is not actually
even a job of this new subshell).  So we go around the loop again at

1778            for (; !nowait;) {

When zsh2 was forked during "sleep 20", we should have hit this

1899                break;

so we must have re-entered at line 1778 after starting the "sleep 21"
and are now incorrectly waiting for the left side of the pipeline.



