Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: Extending regexes



On 2022-07-04 at 14:03 +0200, Sebastian Gniazdowski wrote:
> Zsh has extensions to regular regexes - the ~ and ^ negations. They, as it
> can be expected from negations that are required by Turing universal
> machines, introduce a whole new universe of computations over standard
> regular expressions. For example matching in an AND fashion:

For clarity: zsh has long had the module zsh/pcre, providing
-pcre-match; when the =~ regexp matching operator was added, we
deliberately chose to add a module zsh/regex to use the system ERE
libraries with -regex-match and made that the default implementation
behind the =~ operator.

If you're getting PCRE semantics, then probably somewhere in your
startup files you have something like `setopt re_match_pcre`.
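
A quick way to check, as a minimal sketch: the function below reports
which engine =~ should be using, judging only by the state of the
RE_MATCH_PCRE option. The function name `eqtilde_engine` is made up for
illustration and is not part of zsh. It is meant to be run in zsh; other
shells have no such option, so there it will simply report `regex`.

```shell
# Report which engine backs =~ in the current shell, based only on the
# RE_MATCH_PCRE option. The function name is hypothetical.
eqtilde_engine() {
  if [[ -o re_match_pcre ]]; then
    printf '%s\n' pcre    # option set: =~ is routed to zsh/pcre
  else
    printf '%s\n' regex   # option unset: =~ is routed to zsh/regex (ERE)
  fi
}

eqtilde_engine
```

If this prints `pcre` in a shell where you never asked for it, searching
your startup files for `re_match_pcre` is the likely next step.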

A while back I wrote some bindings for using the RE2 library, which
matches the efficient regexps found in Go and which is licensed such
that more vendors might enable it by default with zsh.  I stopped while
trying to work out how to dig myself out of a hole of my own making:
I had made `RE_MATCH_PCRE` a simple boolean, which leaves no room for a
third engine.

My _tentative_ thinking, which I'd appreciate feedback on, is to
introduce a new special parameter, `ZSH_EQTILDE_ENGINE` or somesuch;
have assignments to it succeed only when the value is one we can parse,
and make any change to the RE_MATCH_PCRE option an implicit assignment
of `regex` or `pcre` to that parameter.
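
To make the proposed semantics concrete, here is a rough sketch using an
ordinary variable and two helper functions.  Everything in it is
hypothetical: `EQTILDE_ENGINE_SKETCH`, `set_eqtilde_engine` and
`set_re_match_pcre` are illustration-only names, the accepted values are
a static list of `regex` and `pcre`, and nothing here touches the real
option or changes how =~ behaves.

```shell
# Hypothetical stand-in for the proposed special parameter; an ordinary
# variable is used because no such special parameter exists.
EQTILDE_ENGINE_SKETCH=regex

# Accept only values from a static list.  Anything else is rejected and
# the current value is left untouched.
set_eqtilde_engine() {
  case "$1" in
    regex|pcre) EQTILDE_ENGINE_SKETCH=$1 ;;
    *) printf 'unknown =~ engine: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Model "changing RE_MATCH_PCRE is an implicit assignment".  The
# argument is "on" or "off"; the real shell option is not modified.
set_re_match_pcre() {
  case "$1" in
    on)  set_eqtilde_engine pcre ;;
    off) set_eqtilde_engine regex ;;
    *)   return 2 ;;
  esac
}
```

With a static list, a third engine such as RE2 would be one more
accepted value in `set_eqtilde_engine`, while the boolean option would
keep mapping only to the two existing engines.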

Is this sane?  Are we happy introducing new special parameters, as long
as the name starts with `ZSH`?  Should the value be the name of a
module, or one of a static list?  If "name of a module", then that would
let people do more than just use our engines (at their own risk), but
should we then update the .mdd files or the exported tables with some
new identifier to mark "use this function to back =~ when the engine
points here"?

I would quite like to move towards being able to expect "better, but
sane" REs to be available, even with commercial OS vendor builds of zsh.
I think RE2 is probably the best way forward, but ... I should probably
have asked long ago for advice on the design decisions which need to be
made.

-Phil



