Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

FIFOs



Bart wrote:
> For guaranteed correct operation, we should remove the PATH_DEV_FD code
> from getproc() in exec.c, or (perhaps better) change it to be used only
> if mkfifo() is absent or fails.

This is easy (apart from the `or fails', which I haven't attempted to
implement --- unless that simply means `or the configure test for it
fails').  

There are two issues here, however (without a patch, you'll have to #undef
PATH_DEV_FD in exec.c to see them).  The first isn't too bad.

% echo <(echo foo)

Here the parent shell can, with the wind in the right direction, get back
and delete the file named by the <(...) before the child has had a chance
to open it (let alone call the code to fill it).  There's no easy way to
synch this, since you end up with deadlock --- the child can't open the
fifo until there's a process reading it.  This has happened to me a few
times.  It looks pretty unlikely if you stare at the code --- the open is
only a few instructions later in the child while the parent is doing all the
normal command processing first --- but if you think about the scheduling
of forked-off child processes on heavily loaded machines (in this case SMP)
maybe it's not so surprising.

One good reason not to worry about this is that if the process actually
opens the fifo, that's guaranteed not to happen, i.e.

% cat <(echo foo)

always works.
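
To make the rendezvous concrete, here is a minimal sketch in C of what
`cat <(echo foo)' amounts to.  It is not taken from exec.c: the helper
name fifo_roundtrip() and its interface are made up for illustration.
The point it tries to show is that both open() calls block until the
other end turns up, so once the reader's open() has returned, the name
can be unlinked without hurting anybody.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not zsh code): create a FIFO at path, fork a
 * writer that sends "foo\n", and read the data back in the parent.
 * Returns the number of bytes read into buf, or -1 on error. */
ssize_t fifo_roundtrip(const char *path, char *buf, size_t len)
{
    if (mkfifo(path, 0600) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {
        /* Child: this open() blocks until some process opens the
         * FIFO for reading. */
        const char *msg = "foo\n";
        int wfd = open(path, O_WRONLY);
        if (wfd == -1)
            _exit(1);
        ssize_t w = write(wfd, msg, strlen(msg));
        close(wfd);
        _exit(w == (ssize_t)strlen(msg) ? 0 : 1);
    }

    /* Parent: this open() blocks until the writer has opened its end,
     * so after it returns both sides hold descriptors and removing
     * the name is harmless. */
    int rfd = open(path, O_RDONLY);
    unlink(path);
    if (rfd == -1) {
        /* The child may be stuck in open(); do not leave it behind. */
        kill(pid, SIGKILL);
        waitpid(pid, NULL, 0);
        return -1;
    }

    /* Read until the buffer is full or the writer closes its end. */
    size_t total = 0;
    ssize_t n;
    while (total < len && (n = read(rfd, buf + total, len - total)) > 0)
        total += (size_t)n;

    close(rfd);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

If the parent skipped the open() and went straight to unlink(), as
happens with `echo <(echo foo)', the child would be left to race
against the deletion.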


The second thing is a killer, at least without a rethink.  In the case
first shown, where the fifo is never opened but this time still exists,
the forked zsh just hangs for ever waiting for a reader and sits around
uselessly in the process table.  The second remark above still applies, but
this time the failure is less benign.  Maybe somebody understands this
better.  Anyway, I haven't sent a patch because of that.

I suppose this is a system issue, since it's not obvious to me why the read
doesn't just fail when the fifo is deleted, at which point there's no
chance of anyone ever reading from it (this is Solaris 2.6).  It would be
reasonably safe to arrange for a timeout, but it would have to be set up
specially since poll() and select() won't work if we haven't yet got an fd.
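
One way such a timeout might be set up, purely as a sketch and not as
a patch for exec.c: POSIX says that opening a FIFO with
O_WRONLY|O_NONBLOCK fails at once with ENXIO when no reader has it
open, so the writer can poll with short sleeps instead of blocking in
open().  The function name open_fifo_writer_timeout(), the 10 ms
polling interval and the use of ETIMEDOUT are all my own assumptions,
and I have not checked how Solaris 2.6 behaves here.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not zsh code): open a FIFO for writing, giving
 * up after roughly timeout_ms milliseconds.  Returns a descriptor in
 * blocking mode, or -1 with errno set: ETIMEDOUT if no reader showed
 * up, ENOENT if the name has been removed, EINVAL if path is not a
 * FIFO. */
int open_fifo_writer_timeout(const char *path, int timeout_ms)
{
    const struct timespec pause = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    int waited_ms = 0;

    for (;;) {
        /* With O_NONBLOCK, open() fails with ENXIO instead of
         * blocking when nobody has the FIFO open for reading. */
        int fd = open(path, O_WRONLY | O_NONBLOCK);
        if (fd != -1) {
            struct stat st;
            if (fstat(fd, &st) == -1 || !S_ISFIFO(st.st_mode)) {
                close(fd);
                errno = EINVAL;
                return -1;
            }
            /* A reader exists: go back to ordinary blocking writes. */
            int flags = fcntl(fd, F_GETFL);
            if (flags != -1)
                fcntl(fd, F_SETFL, flags & ~O_NONBLOCK);
            return fd;
        }
        if (errno != ENXIO && errno != EINTR)
            return -1;          /* e.g. ENOENT: the FIFO was deleted */
        if (waited_ms >= timeout_ms) {
            errno = ETIMEDOUT;
            return -1;
        }
        nanosleep(&pause, NULL);
        waited_ms += 10;
    }
}
```

With this the child would notice a deleted fifo on its next attempt
(ENOENT) and would give up on one that nobody ever reads (ETIMEDOUT),
instead of sitting in the process table.  The cost is that a slow
reader can be timed out by mistake, so the limit would need choosing
with some care.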

-- 
Peter Stephenson <pws@xxxxxxxxxxxxxxxxxxxxxxxxx>
Cambridge Silicon Radio, Unit 300, Science Park, Milton Road,
Cambridge, CB4 0XL, UK                          Tel: +44 (0)1223 392070


