Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: treatment of empty strings - why is this not a bug?



I was hoping to just let this thread go by, but maybe I'm the only one
still on list who's been around long enough to have an inkling of what
is going on here.

On Jan 13,  2:32am, Greg Klanderman wrote:
}
} I just don't understand why there should be any distinction
} between the splitting and dropping of empty strings.

It all goes back to the semantics of $@.

Here's bash:

$ set -- aa "" bb "" cc
$ echo $#
5
$ for x in $@; do echo $x; done
aa
bb
cc
$

Note that the empty elements of $@ are dropped.  Now, you might argue
that the *reason* they're dropped is because $@ has been joined into
a string and then re-split into words on $IFS, but the end result is
that empty elements disappear unless quoted.
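
A minimal sketch of "unless quoted", for anyone who wants to see the two cases side by side. The helper name count_args is made up for illustration; the rest is plain POSIX sh and should behave the same in bash, dash and zsh.

```sh
# Sketch only: count_args is a hypothetical helper, not a shell builtin.
count_args() { echo $#; }

set -- aa "" bb "" cc

# Unquoted: the two empty elements disappear.
unquoted=$(count_args $@)
# Quoted: every element, empty or not, stays a separate argument.
quoted=$(count_args "$@")

echo "unquoted: $unquoted"   # unquoted: 3
echo "quoted:   $quoted"     # quoted:   5
```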

Paul Falstad (original author of zsh) made a conscious decision that
a non-empty string, even one containing some or only characters that
appear in $IFS, was a significant item of data and should remain in
the expansion of $@.  However, he was unwilling to deviate from the
standard semantics of $@ to the point of treating empty strings as
significant.  Too many programs that deal with file names, for example,
would begin spewing errors if they received empty strings in their
argument lists when called as e.g. ls -l $@.
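
To make that distinction concrete, here is a small sketch with one element that is a single space and one that is empty. count_args is a hypothetical helper, and the zsh result assumes default options (in particular, shwordsplit unset).

```sh
# Sketch only: count_args is a hypothetical helper.
count_args() { echo $#; }

# Four positional parameters: "aa", a single space, an empty string, "bb".
set -- aa " " "" bb

n=$(count_args $@)
echo "unquoted \$@ passed $n arguments"
# zsh (default options): 3   the space survives, the empty string is dropped
# bash:                  2   the space is lost to $IFS splitting as well
```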

The point is to be minimally surprising to the person who just doesn't
want to think very hard about array and $IFS semantics, not to be
entirely logically consistent to someone who analyzes the behavior in
detail.  If you're enough of a geek to care, you're also enough of a
geek to figure out a workaround.

Everything else follows from there.  $a[@] is supposed to have the
same semantics as $@ for any array a; this must be, so that $argv[@]
is exactly equivalent to $@.
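
A rough illustration of that equivalence, for zsh only (the unbraced $a[@] form is not array syntax in bash). The array name a and the helper count_args are invented for the example, and default zsh options are assumed.

```zsh
# zsh only. Sketch: count_args and the array a are hypothetical names.
count_args() { echo $#; }

a=(aa "" bb)
set -- aa "" bb

n_at=$(count_args $@)            # 2: empty element dropped
n_argv=$(count_args $argv[@])    # 2: same as $@
n_a=$(count_args $a[@])          # 2: same rule for any array
n_a_quoted=$(count_args "$a[@]") # 3: quoting keeps the empty element

echo "$n_at $n_argv $n_a $n_a_quoted"
```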

As to why things behave seemingly differently when using (s-:-) or
the like, well, in many cases zsh's parameter expansion was developed
looking only at the end results and not by paying close attention to
which "layer" of the implementation performed which operation.  As new
features were built on top of the implementation, particularly some of
the nested expansion tricks, unexpected oddities arose because that
layering became more important.  Some of those oddities became widely
enough relied on in scripts that we're now essentially stuck with them.

(Zsh is not one of "those" open source projects that is willing to
allow every new release to be "improved" by breaking everything that
went before.  For the most part, what has always worked still works,
and if it doesn't it's usually because it was never intended to work,
not because someone decided that the way it was once meant to work is
now philosophically wrong.)

The more literal explanation for your example has to do with multiple
consecutive separators being treated as a single word break (as happens
with, for example, multiple consecutive spaces when splitting on $IFS
in the shwordsplit case) vs. multiple separators being treated as
multiple word breaks.  However, I've forgotten whether rcexpandparam
was intentionally or accidentally given that side-effect.
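
The two splitting styles can be seen with ordinary $IFS splitting in a POSIX shell. This is only an analogy, not the rcexpandparam case itself: it is written for bash or dash, where unquoted parameters are always split (zsh would need shwordsplit set), and count_fields is a hypothetical helper.

```sh
# Sketch for bash/dash. count_fields is a hypothetical helper.
count_fields() { echo $#; }

spaces="aa   bb"    # three consecutive spaces
colons="aa:::bb"    # three consecutive colons

# Whitespace separators: a run of them is a single word break.
n_space=$(IFS=' '; count_fields $spaces)
# Non-whitespace separators: each one is its own word break,
# so the two gaps between colons become empty fields.
n_colon=$(IFS=':'; count_fields $colons)

echo "spaces: $n_space"   # spaces: 2
echo "colons: $n_colon"   # colons: 4
```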


