Zsh Mailing List Archive

Re: [PATCH] Use == in expressions instead of the deprecated =



2016-09-09 10:31:55 +0100, Peter Stephenson:
> On Fri, 09 Sep 2016 09:52:31 +0100
> Stephane Chazelas <stephane.chazelas@xxxxxxxxx> wrote:
> > It's still possible that the next major POSIX spec will have
> > [ == ] and maybe even [[ ]].
> > 
> > You guys may want to comment on the latest proposal there:
> > http://austingroupbugs.net/file_download.php?file_id=31&type=bug
> > 
> > as it would make zsh non-conformant.
> > 
> > In particular, it  proposes specifying [[ =~ ]] the ksh/bash
> > way, that is where quoting escapes regexp operators (
> > [[ a =~ "." ]] returns false) and [[ < ]] required to use
> > collation.
> 
> If POSIX-like shells already do it that way, the best we could offer
> would probably be another kludge-up option anyway.  Not specifying it,
> the only other option, isn't really doing anyone any favours in the end.
[...]

Note that doing it properly is tricky. bash had a number
of bugs before it eventually settled on something reliable
enough.

bash -c '[[ "\\" =~ [^]"."] ]]'

Still returns false though.

With zsh supporting both ERE and PCRE, it's going to be even
more of a pain, I would say.

Specifying it (which the current proposal doesn't really
address) is also going to be difficult.

The problem is that the shell needs to know the syntax of the
regular expressions in order to be able to quote the RE
operators properly for the regexp engine.

For instance, in [[ x =~ "." ]], the shell has to call
regcomp("\\."), but in [[ x =~ ["."] ]], it must call
regcomp("[.]"), as otherwise a regcomp("[\.]") would also match
on backslash. bash now does that properly in most cases (after I
raised the bug some time ago), but misses the [^]...] case as
I've just realised.
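
A minimal sketch of the observable side of this, assuming a bash
of 3.2 or later with default compatibility settings (try_test is
a hypothetical helper, not something from bash itself). It only
shows the simple quoted-versus-unquoted cases, not the bracket
expression ones:

```shell
# try_test is a hypothetical helper: it runs one test in its own bash
# (in the style of the bash -c example above) and reports the result.
try_test() {
  if bash -c "$1"; then
    printf '%s -> true\n' "$1"
  else
    printf '%s -> false\n' "$1"
  fi
}

try_test '[[ a =~ . ]]'     # unquoted: . is an RE operator, so "a" matches
try_test '[[ a =~ "." ]]'   # quoted: . is taken literally, so "a" does not match
try_test '[[ . =~ "." ]]'   # quoted: a literal dot does match
try_test '[[ a =~ \. ]]'    # backslash-quoted: also taken literally
```

The first and third tests should report true, the other two false.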

And in [[ x =~ "<" ]], you don't want to do a regcomp("\\<"), as
\< has a special meaning in some regexp engines. So bash does a
regcomp("<") (for both [[ x =~ "<" ]] and [[ x =~ \< ]])
instead. What that means is that you can't use the extensions of
your local regexp library unless you use a variable as in:

var='\<foo\>'
[[ $x =~ $var ]]

because then bash does a regcomp("\\<foo\\>") (just like zsh).

See also:

a='foo\'
[[ 'foo\bar' =~ $a"." ]]

That returns true in bash (as it does a regcomp("foo\\\\.")),
though here we are asking for trouble in the first place with
that trailing backslash.

In effect, at the moment, to use =~ portably across ksh, bash,
bash31 and zsh, you have to use:

regex=...
[[ $string =~ $regex ]]
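
As a sketch of that idiom (the function name, the regexp and the
sample strings are made up for illustration, and only bash is
exercised here), keeping the RE operators in a variable and
expanding it unquoted sidesteps the shell's own quoting rules:

```shell
# portable_match_demo is a hypothetical wrapper; bash is invoked explicitly
# because [[ ]] is not available in every sh.
portable_match_demo() {
  bash <<'EOF'
regex='^[0-9]+\.[0-9]+$'   # RE operators live in the variable, not inside [[ ]]
for string in 5.2 5x2 52; do
  if [[ $string =~ $regex ]]; then
    echo "$string: match"
  else
    echo "$string: no match"
  fi
done
EOF
}

portable_match_demo
```

Only 5.2 should be reported as a match; note that $regex must stay
unquoted inside [[ ]], as quoting it would make bash treat it
literally.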

And between yash and zsh:

[ "$string" "=~" "$regex" ]

In PCRE, you have things like (?x) that affect the parsing and
would make things more complicated, but you might be able to
leverage \Q \E.

-- 
Stephane


