Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

PATCH: ${foo#* }



Sven Wischnowsky wrote:
> I only wanted to improve the completion for `_a2ps' and then
> discovered that matching as in `${foo#* }' is extremly slow if `foo'
> contains lots of spaces. So much so, that I had to change
> `_long_options' to make the changed `_a2ps' usable.
> 
> Try:
> 
>   % a=''
>   % repeat 500 a="$a a"
>   % echo "${a#* }"
> 
> If I remember correctly, Peter said something about performance when
> cleaning up `getmatch()'.

There were two ways of doing this; I picked the one assuming that the
quickest way was to start from the longest match and shorten, which in this
case is obviously wrong.

The adaption below is still a compromise.  The reason is that the way the
pattern matcher works makes it easy to get the longest match, but you
always have to get the shortest by brute force; so if the pattern doesn't
match at all, it can be slow to get the shortest match, since it has to try
the entire string (though this is not a problem in the example here).  In
this version, the longest match is still always found first, but the
shortest is then found by gradually lengthening the string towards that.
Maybe this will keep everyone happy.  It certainly ought to be OK for
spaces in normal text, where they occur frequently.

--- Src/glob.c.short	Mon Mar  1 10:45:43 1999
+++ Src/glob.c	Wed Mar 10 18:15:24 1999
@@ -2304,26 +2304,25 @@
     case SUB_LONG:
 	/*
 	 * Largest/smallest possible match at head of string.
-	 * First get the longest match.
+	 * First get the longest match...
 	 */
 	if (dolongestmatch(s, c, 0)) {
 	    char *mpos = pptr;
-	    while (!(fl & SUB_LONG) && pptr > s) {
-		/*
-		 * If we want the shortest, keep backing up to the
-		 * previous character and find the longest up to there.
-		 * That way we can usually reach the shortest in only
-		 * a few attempts.
-		 */
-		t = (pptr > s + 1 && pptr[-2] == Meta) ? pptr - 2 : pptr -1;
+	    if (!(fl & SUB_LONG)) {
+	      /*
+	       * ... now we know whether it's worth looking for the
+	       * shortest, which we do by brute force.
+	       */
+	      for (t = s; t < mpos; METAINC(t)) {
 		sav = *t;
 		*t = '\0';
-		if (!dolongestmatch(s, c, 0)) {
-		    *t = sav;
-		    break;
+		if (dolongestmatch(s, c, 0)) {
+		  mpos = pptr;
+		  *t = sav;
+		  break;
 		}
-		mpos = pptr;
 		*t = sav;
+	      }
 	    }
 	    *sp = get_match_ret(*sp, 0, mpos-s, fl, replstr);
 	    return 1;

-- 
Peter Stephenson <pws@xxxxxxxxxxxxxxxxx>       Tel: +39 050 844536
WWW:  http://www.ifh.de/~pws/
Dipartimento di Fisica, Via Buonarroti 2, 56127 Pisa, Italy



Messages sorted by: Reverse Date, Date, Thread, Author