Zsh Mailing List Archive

Re: [PATCH v2] exec: run final pipeline command in a subshell in sh mode



On 2020-06-06 at 04:33:50, Daniel Shahaf wrote:
> brian m. carlson wrote on Fri, 05 Jun 2020 20:41 +0000:
> > On 2020-06-05 at 10:21:41, Mikael Magnusson wrote:
> > > On 6/5/20, brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > > zsh typically runs the final command in a pipeline in the main shell
> > > > instead of a subshell.  However, POSIX requires that all commands in a
> > > > pipeline run in a subshell, but permits zsh's behavior as an extension.
> > > 
> > > What POSIX actually says is:
> > > "each command of a multi-command pipeline is in a subshell
> > > environment; as an extension, however, any or all commands in a
> > > pipeline may be executed in the current environment"
> 
> That's quoted from https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_12.
> 
> The part Brian quotes below is from https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap02.html#tag_02_01_01.
> 
> > > I.e., it does not say "shall", so it doesn't require a subshell at
> > > all; in fact it explicitly does permit not using one, as you also
> > > say. The
> 
> This interpretation is analogous to how conforming C programs must
> assume neither that «char» is signed nor that it is unsigned.

Right.  That term in C is "implementation defined."  POSIX has that term
as well, and it is not used here.  That term means that the
implementation may pick a behavior, but must document its choice.

> The sentence preceding the one you quoted reads:
> .
>     Non-standard extensions, when used, may change the behavior of
>     utilities, functions, or facilities defined by POSIX.1-2017.
> 
> I take this to mean non-standard extensions aren't bound by "shall"s.
> 
> As to why the passage Mikael quoted doesn't use the word "shall"… well,
> presumably it doesn't use the word "shall" because it doesn't describe
> "a feature or behavior that is mandatory"¹.

Sure, but if the standard didn't want that behavior to be specified
somehow, then it wouldn't have mentioned it.  Why wouldn't POSIX have
just omitted that statement and said nothing about it?

POSIX also says[0] that "[w]hen data is transmitted over the network, it
is sent as a sequence of octets (8-bit unsigned values)" and "16 and
32-bit values can be converted using the htonl(), htons(), ntohl(), and
ntohs() functions."  I don't think we can argue that POSIX permits one
to use 8-bit signed values or 9-bit values or that the implementation
can fail to make those functions work this way just because they didn't
use "shall".  The word "shall" is omitted (and "is" used) all over the
shell definitions to describe syntax forms, and one isn't permitted to
substitute some other syntax form in place of the standard one.

> > What POSIX does say is that one “shall define an environment in which an
> > application can be run with the behavior specified by POSIX.1-2017.”
> > I'm proposing that "zsh --emulate sh" implement the POSIX behavior for
> > that reason.
> 
> What Mikael's saying is that zsh's incumbent behaviour is already
> POSIX-conforming, but POSIX-conforming implementations have some leeway:
> they have a range of possible behaviours to choose from, just like
> conforming C compilers can choose what signedness to give to «char».

I don't agree.  The signedness of «char» is implementation defined, and
that term has a specific meaning.  Certainly implementations can
implement additional extensions, provided they don't conflict with the
behavior specified in POSIX.

> The passage Mikael quoted specifies that running the last command in
> a pipeline in the current environment, rather than in a subshell, is
> permitted in certain cases, outlined by the phrases "as an extension"
> and "may".
> 
> The definition of "may"¹ says it's used to describe "optional" behaviours,
> and that conforming applications should tolerate both presence and
> absence of that behaviour.

It says that an "application should not rely on the existence of the
feature or behavior."  It doesn't say that we can't rely on the absence
of that feature in a conforming environment.

> To summarize, I don't see why behaviour specified with the phrases "as
> an extension" and "may" should be off by default in a POSIX-conforming
> mode.  Would you elaborate on this?

Because the behavior specified declaratively (albeit without "shall")
differs materially from the behavior of the extension.  If we were
talking about situations where the behavior was a choice between
producing an error (that is, just failing) and producing a useful
output, then clearly nobody would care: just don't rely on the program
failing if you give it the syntax specified in an extension.

For example, the shell is permitted to recognize additional arithmetic
expressions as an extension.  It would be permissible for the shell to
understand the legacy C-style expressions like =* (instead of *=), but
when in POSIX mode, the following would need to print -4:

  sh -c 'x=2; : $((x =- 4)); echo $x'

For behaviors where there is no conflict, such as =*, we could always
print 8 here, even in a POSIX mode:

  sh -c 'x=2; : $((x =* 4)); echo $x'
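
A minimal sketch of the conflicting case, runnable in a POSIX sh.  The
variable names posix_result and compound_result are illustrative only.
The first line shows the reading POSIX arithmetic gives to "x =- 4";
the second shows the modern compound operator, which is the value a
shell would produce for the first line if it adopted the legacy
reading of =- as an extension.

```shell
# POSIX arithmetic parses "x =- 4" as the assignment x = (-4).
x=2; : $((x =- 4)); posix_result=$x

# The compound operator, shown for contrast: x = x - 4.
x=2; : $((x -= 4)); compound_result=$x

echo "$posix_result $compound_result"
```

The two results differ, which is the sense in which the =- extension
conflicts with the specified behavior while =* does not.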

> (On the other hand, I'm not sure why they bothered to write the words
> "as an extension" there.  They don't seem to change the meaning one way
> or the other.)

In general, we have to assume standards authors (and legislators) wrote
the text for a reason and not to be wasteful with words.  Therefore, we
should assume there is a relevant difference in meaning.

> Well, perhaps there is something we can do to make their lives easier.
> 
> Continuing the analogy to C, gcc(1) has -fsigned-char/-funsigned-char
> flags to help unportable programs.  However, I hesitate to propose
> adding an option just for this: adding options is always easy to
> suggest, but not always a good idea.
> 
> Since zsh already incorporates a parser for sh scripts, perhaps we could
> write a tool that automatically adds parentheses to the last element in
> every pipeline.  That's not such a crazy idea: it already exists (in a
> much more general form) for C: http://coccinelle.lip6.fr/

I think if your goal is for people to change their code to work around
this when zsh is sh, they will simply not do so, even if that's an
option, because it doesn't work by default.  In Git alone, there are
over 240,000 lines of shell between code and tests.  Debian must contain
tens of millions more.  It's just not going to be achievable to get all
of those lines changed to work this way.
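
For concreteness, the mechanical rewrite suggested in the quoted text
(parenthesizing the last element of a pipeline) would produce code of
roughly the following shape.  This is only a sketch; the variable name
and input are made up for illustration.

```shell
x=before

# Explicit parentheses force the last stage into a subshell in every
# shell, so the assignment made by read cannot escape the pipeline.
printf 'after\n' | ( read x; echo "inside: $x" )

# The calling shell still sees the original value.
echo "outside: $x"
```

Written this way, the script behaves the same regardless of where the
shell would otherwise run the final stage, but every affected pipeline
has to be edited.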

If I were to add an option that were off by default for sh and on for
zsh, then that would meet my needs, and I'd be happy to implement that.
You seem to be unexcited about that possibility, though.

> > zsh is a very popular interactive shell, and allowing it to be used as a
> > portable sh on systems where the system sh is less capable would be
> > really beneficial.
> 
> How would it be beneficial?

It's already present on a lot of those systems and it avoids the need to
build one shell for interactive use and another for portable scripting.
zsh is also appealing as a portable sh because it has a pleasant
interactive mode, whereas many sh implementations (e.g., dash) do not.

> > If your objection is to the wording, I'm happy to revise it to remove
> > the word "requires", but I do think this provides a lot of benefits for
> > the sh scripting case while not impacting users who are expecting
> > different behavior for the zsh case.
> 
> The patch would constitute a backwards-incompatible change to anyone who
> uses zsh as sh today and relies on the current behaviour of pipelines.

The thing is, I don't believe anyone does, except possibly on
macOS[1].  I have tried zsh as sh on Debian and many things are broken
(including debconf).  I'm not aware of any other supported operating
systems[2] where using zsh as /bin/sh is a permitted option.

I should also point out that when people write "emulate sh", they
probably very much want to emulate the behavior of /bin/sh on their
system.  I'm not aware of any supported system in existence where the
default /bin/sh (or the default POSIX sh, when /bin/sh is not
POSIX-compatible) has the zsh behavior; they all run all pipeline stages
in a subshell.
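
A minimal sketch of the observable difference, assuming nothing beyond
the printf and read builtins; the variable name is illustrative.

```shell
x=before

# The final stage of this pipeline is the builtin read.
printf 'after\n' | read x

# Shells that run every stage in a subshell (dash, bash with default
# settings) leave x untouched and print "before"; zsh in its native
# mode runs read in the main shell and prints "after".
echo "$x"
```

Scripts written against a typical /bin/sh can end up depending on the
first outcome, for example by reusing a variable name inside the last
pipeline stage and expecting the outer value to survive.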

I want to be clear that I don't want to change the behavior of the zsh
mode, where I agree a change would be undesirable and people are almost
certainly relying on the current behavior.

> This might have been acceptable if it were a question of changing
> a non-conforming behaviour to a conforming behaviour.  However, the
> current behaviour does appear to be conforming.

I'm not in agreement that a shell which provides only zsh's behavior is
conforming in this case.

[0] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html
[1] And macOS users are not relying on this behavior from zsh as sh
    because bash and dash are also valid sh options.
[2] That is, operating systems in versions which still receive security
    support from their vendor.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204



