Zsh Mailing List Archive

Re: [PATCH] PCRE/NUL: pass NUL in for text, handle NUL out



2017-06-16 20:10:49 -0700, Bart Schaefer:
> On Jun 16,  7:41am, Stephane Chazelas wrote:
> }
> } Solution for now in zsh is to escape like:
> } 
> }    [[ $x =~ "\b\Q${word//\\E/\\E\\\\E\\Q}\E" ]]
> 
> Hmm, wouldn't "\b\Q${(b)word}\E" be sufficient there?  In fact if
> you've applied ${(b)word} do you even need \E and \Q ?

Not really.

Inside \Q...\E in a PCRE, only \E is special, and there's no
escaping you may do. It's like strong quotes. Changing ? to \?
would change the meaning of the regexp, and wouldn't help for \E.
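
As a rough sketch of the \E handling from the escaping quoted above
(the name pcre_literal is made up for illustration, it is not
something zsh provides), each \E in the word is rewritten so that it
closes the quoted span, adds a literal backslash and E, then reopens
the span:

```shell
# Hypothetical helper (not part of zsh): emit a PCRE fragment that matches
# its argument literally. Each \E in the input would end the \Q...\E span
# early, so it is rewritten as \E (close), \\E (literal backslash, then E),
# \Q (reopen).
pcre_literal() {
  printf '%s\n' "\Q${1//\\E/\\E\\\\E\\Q}\E"
}

pcre_literal 'a?b'    # prints \Qa?b\E
pcre_literal 'a\Eb'   # prints \Qa\E\\E\Qb\E
```

With PCRE matching enabled for =~, the result could then be used as
in [[ $x =~ "\b$(pcre_literal "$word")" ]].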

Outside of \Q...\E, where what needs to be escaped depends on
whether the regexp has (?x), there are things like . or $ (or
blanks with (?x)) that ${(b)word} would still leave unescaped.

PCREs (as opposed to some ERE implementations that have things
like \<, \=) are good though in that, AFAICT, the only \x
operators are those where x is an ASCII alnum, so adding a \ in
front of every ASCII non-alnum should be enough I would think (as
long as we're not inside [...] or things like \g{...}). So an
equivalent of ${(b)var} for PCRE should not be too difficult.

Quoting for both ERE and PCRE is a problem in theory for (?x) and
blanks, as "\ " is unspecified in ERE, but in practice, I
don't think any ERE implementation would ever have "\ " as a
special operator. So I think it should be a matter of quoting
only (and not more than):

ASCII [[:space:]]
$^*()+[]{}.?\|
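
A minimal sketch of such a quoting function, assuming the character
list above is complete (re_quote is a made-up name, and this is only
an illustration, not a proposed implementation). It walks the string
one character at a time and puts a backslash in front of the listed
characters only:

```shell
# Hypothetical helper: put a backslash in front of ASCII whitespace and of
# $^*()+[]{}.?\| and leave every other character untouched.
re_quote() {
  # space, tab, newline, VT, FF, CR, then the operators listed above
  local specials=$' \t\n\v\f\r$^*()+[]{}.?\\|'
  local s="$1" out='' i=0 c
  while [ "$i" -lt "${#s}" ]; do
    c="${s:$i:1}"                     # character at offset i (0-based)
    case $specials in
      *"$c"*) out="$out\\$c" ;;       # listed character: add a backslash
      *)      out="$out$c" ;;         # anything else: copy as-is
    esac
    i=$((i + 1))
  done
  printf '%s\n' "$out"
}

re_quote 'a.b (c)'   # prints a\.b\ \(c\)
```

The loop works on characters as the shell sees them in the current
locale, so it does not by itself address the charset caveat.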

(and again, from a security standpoint at least, that quoting
could be fooled in some locales, like those that have BIG5-HKSCS
or GB18030 as the charset, where the encoding of some characters
contains the encoding of other characters, including ASCII ones).
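
To illustrate the charset problem at the byte level: in Big5 (and so
in BIG5-HKSCS), the two bytes 0xA5 0x5C encode a single character,
and 0x5C is also ASCII backslash. The sketch below uses two made-up
helpers, hexbytes and bytewise_quote, and assumes a quoter that
works byte by byte, which is an assumption for the demonstration and
not a claim about how any given shell does it:

```shell
# Hypothetical helpers for the demonstration.
hexbytes() { od -An -tx1 | tr -d ' \n'; echo; }      # dump stdin as hex
bytewise_quote() { LC_ALL=C sed 's/\\/\\\\/g'; }      # double every 0x5C byte

# One Big5 character (0xA5 0x5C) followed by a newline.
printf '\245\134\n' | hexbytes                    # prints a55c0a

# The byte-wise quoter sees a backslash inside the character and doubles it.
printf '\245\134\n' | bytewise_quote | hexbytes   # prints a55c5c0a
```

Read back in a Big5 locale, the result is the original character
followed by a stray backslash, which would then escape whatever
comes next in the regexp.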

-- 
Stephane


