Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: [BUG] Long line makes pattern matching (by //) hog Zsh



On Sun, 5 Jun 2016 12:10:20 -0700
Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> wrote:

> On Jun 5,  4:36pm, Sebastian Gniazdowski wrote:
> }
> } 1. not backslash nor slash nor space [^ /\\\\]##
> } 2. not number, slash, backslash, space [^0-9/\\\\ ]##
> } 3. not slash, backslash [^/\\\\]#
> } 4. end of line (#e)
>
> It's in the block in pattern.c:patmatch() that begins with the
> comment:
> 
> 		/*
> 		 * Lookahead to avoid useless matches. This is not possible
> 		 * with approximation.
> 		 */

That comment happens to be an irrelevance --- that's where it's looking
for something that's an exact match to follow.  There isn't one in this
case.  If there was, we would have latched onto it and we wouldn't need
to try quite so hard rearranging the other elements of the pattern.

The containing block is where it's handling # and ##, so there's no
great surprise it's there.

> Specifically, in the "if (no >= min) for (;;) ..." loop, at each character
> in the input string patmatch() is called recursively to look at the rest
> of the string, which again enters this same loop because the next thing
> is also a one-or-more expression, which calls recursively and again
> enters the loop because the thing after that is a zero-or-more.

The problem is the patterns are pathological.  Each of them can match
the same characters.  So it's spending a lot of time repartitioning the
matches between the possibilities of 1. and 2. and 3. in the above.
That's not polynomially bounded.  I'm not sure if it's even
exponentially bounded.

What I'm not sure of is whether there's a way of improving this without
some special case or, obviously, making the patterns more specific.
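
To make the repartitioning concrete, here is a rough sketch in Python of a
greedy backtracking matcher. It is a toy model, not the code in pattern.c:
the function count_attempts, the lists OVERLAPPING and SPECIFIC, and the
test string are all made up for illustration. It only counts how often the
recursive matcher is entered when three repeated elements are followed by
an end-of-string anchor and the match has to fail.

```python
# Toy model of a greedy backtracking matcher. This is NOT zsh's pattern.c;
# every name here is hypothetical and exists only for illustration.
# A pattern is a list of (allowed, minimum) elements followed by an
# implicit end-of-string anchor, standing in for (#e).

def count_attempts(text, elements):
    """Return (matched, calls): whether the whole text matches, and how
    many times the recursive matcher was entered."""
    calls = 0

    def match(pos, idx):
        nonlocal calls
        calls += 1
        if idx == len(elements):
            return pos == len(text)          # the end anchor
        allowed, minimum = elements[idx]
        # Find the longest run this element could consume from pos.
        end = pos
        while end < len(text) and allowed(text[end]):
            end += 1
        # Greedy: try the longest split first, then give characters back
        # one at a time and retry the rest of the pattern.
        for stop in range(end, pos + minimum - 1, -1):
            if match(stop, idx + 1):
                return True
        return False

    return match(0, 0), calls


DIGITS = "0123456789"

# Elements 1. to 3. from the quoted list: every one of them accepts "a".
OVERLAPPING = [
    (lambda c: c not in "\\/ ", 1),                      # 1. one or more
    (lambda c: c not in "\\/ " and c not in DIGITS, 1),  # 2. one or more
    (lambda c: c not in "\\/", 0),                       # 3. zero or more
]

# A made-up "more specific" pattern whose elements cannot share characters.
SPECIFIC = [
    (lambda c: c.isalpha(), 1),   # letters, one or more
    (lambda c: c in DIGITS, 1),   # digits, one or more
    (lambda c: c == ".", 0),      # dots, zero or more
]

if __name__ == "__main__":
    for n in (10, 20, 40):
        text = "a" * n + "/"      # the trailing slash makes the match fail
        print(n,
              count_attempts(text, OVERLAPPING)[1],
              count_attempts(text, SPECIFIC)[1])
```

In this toy model, with three overlapping elements, the call count grows
about cubically with the length of the run of matchable characters, and
every further element that can match the same characters adds another
factor. With the non-overlapping elements it stays linear, because the
second element fails immediately at every split. The real matcher does more
work per call, and a substitution with // retries from other start
positions too, so the numbers are only indicative.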

pws


