Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: Extending regexes



On Mon, Jul 4, 2022 at 6:53 AM Peter Stephenson
<p.w.stephenson@xxxxxxxxxxxx> wrote:
> > On 04 July 2022 at 13:03 Sebastian Gniazdowski <sgniazdowski@xxxxxxxxx> wrote:
> > Zsh has extensions to regular regexes - the ~ and ^ negations.
>
> You're quite right both that they're very useful in zsh and there's nothing
> like this in normal regular expressions, but unfortunately I've got a strong
> feeling this is a big can of worms [hope that image is graphic enough that
> I don't need to explain the phrase for non-native English speakers].

In particular, these no longer fit the formal definition of "regular".

PWS, correct me if I go too far astray, but (^Y) is internally (*~Y)
and (X~Y) is implemented by first matching (X) and then removing
anything that matches (Y) ... which is where the regular-ness breaks
down.  My formal training on this is more than a little rusty, but I
believe this means chaining together two finite-state machines rather
than building a single one.
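
To make the two-stage description above concrete, here is a rough
model in Python.  It is only a sketch: it uses Python's fnmatch module
as a stand-in for a glob matcher, which is not zsh's pattern engine,
and the helper names match_except and match_not are made up for
illustration.  It shows the "match X, then throw out whatever matches
Y" behaviour, with (^Y) expressed as (*~Y).

```python
from fnmatch import fnmatchcase


def match_except(name, x, y):
    """Model of (X~Y): accept names that match x, then reject those matching y."""
    return fnmatchcase(name, x) and not fnmatchcase(name, y)


def match_not(name, y):
    """Model of (^Y), treated as (*~Y): anything at all, except what matches y."""
    return match_except(name, "*", y)


# Hypothetical file names, purely for illustration.
names = ["main.c", "main.o", "util.c", "README"]

# Roughly what *.c~main* would select from the list above.
kept = [n for n in names if match_except(n, "*.c", "main*")]

# Roughly what ^*.o would select from the list above.
not_objects = [n for n in names if match_not(n, "*.o")]
```

Each call runs two independent matches against the same string, which
mirrors the "two machines chained together" picture rather than a
single combined pattern.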

On Mon, Jul 4, 2022 at 5:06 AM Sebastian Gniazdowski
<sgniazdowski@xxxxxxxxx> wrote:
>
> I think that regexes look pretty limited from this point of view and that pcre extensions went wrong path with the look forward and behind semantics.

Note that of course "pcre" stands for "Perl-compatible RE" so you can
find the justifications for look-{ahead,behind} in the history of Perl
development.  Again, a long time ago, but my recollection is that the
reason "lookaround assertions" are zero-width elements is to preserve
the finite-state semantics.  Please take that with 30 years' worth of
salt grains (a less self-explanatory idiom than Peter's, I fear).
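
For anyone unfamiliar with what "zero-width" means here, a small
sketch using Python's re module, which accepts the same (?=...) and
(?!...) lookahead syntax as PCRE.  It only demonstrates that the
assertion inspects the following text without consuming it; it says
nothing either way about how any engine implements this internally.

```python
import re

# Plain match: all six characters are consumed.
m_plain = re.match(r"foobar", "foobar")

# Positive lookahead: "bar" must follow, but it is not part of the match.
m_look = re.match(r"foo(?=bar)", "foobar")

# Negative lookahead: fails here because "bar" does follow "foo".
m_neg_fail = re.match(r"foo(?!bar)", "foobar")

# Negative lookahead: succeeds here because "baz" follows instead.
m_neg_ok = re.match(r"foo(?!bar)", "foobaz")

plain_end = m_plain.end()   # position after the whole string
look_end = m_look.end()     # position just after "foo"
```

Because the lookahead leaves the match position untouched, whatever
comes next in the pattern starts matching at the same place the
assertion looked at.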



