Zsh Mailing List Archive

PATCH: 3.1.5: bash ${.../old/new}



I wrote:
> Phil Pennock wrote:
> > * ${parameter/pattern/string} and ${parameter//pattern/string}
> >   pattern is expanded as per pathname expansion.  Longest match of
> >   pattern against parameter is replaced with string.  Once for / and for
> >   all instances with //.  #pattern anchors to beginning, %pattern
> >   anchors to end.  string may be null.  Applied to an array, this works
> >   on each element.
> >   zsh has some of this with the colon-modifier 's'.
> 
> Maybe it can be done quite simply by upgrading the extra flags Sven
> added for # and % to match internal bits of a parameter's value.

This turns out to be correct.  It's quite difficult to do pattern
substitution otherwise --- you have to hack off the head and tail and
muck around inside --- and particularly multiple substitution, so I
think this is useful.

Actually, doing a single substitution was really easy; doing a global
one required a little more effort, particularly to avoid accumulating
partially substituted strings.  It seems to work.
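
To illustrate the point about partially substituted strings, here is a
minimal standalone sketch of the same idea: first record where every
match sits in the original string, then assemble the result in one
pass.  It is not the code from the patch.  The name replace_all and the
use of strstr (literal matching rather than zsh patterns) and malloc
(rather than the zsh heap) are assumptions made to keep it short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One matched chunk: offsets into the original string, b <= e. */
struct range { size_t b, e; };

/*
 * Replace every non-overlapping occurrence of the literal string pat
 * in s with repl.  Hypothetical helper for illustration only; returns
 * a malloc'd string the caller must free, or NULL if allocation fails.
 */
char *replace_all(const char *s, const char *pat, const char *repl)
{
    size_t l = strlen(s), plen = strlen(pat), rlen = strlen(repl);
    size_t n = 0, cap = 8, i, pos, out_len;
    struct range *r;
    const char *p;
    char *out, *t;

    assert(plen > 0);	/* empty patterns are not handled in this sketch */
    r = malloc(cap * sizeof(*r));
    if (!r)
        return NULL;

    /* Pass 1: record the matches, always scanning the original string. */
    for (p = s; (p = strstr(p, pat)) != NULL; p += plen) {
        if (n == cap) {
            struct range *nr = realloc(r, (cap *= 2) * sizeof(*r));
            if (!nr) {
                free(r);
                return NULL;
            }
            r = nr;
        }
        r[n].b = (size_t)(p - s);
        r[n].e = r[n].b + plen;
        n++;
    }

    /* Pass 2: copy unmatched chunks and replacements alternately. */
    out_len = l - n * plen + n * rlen;
    out = t = malloc(out_len + 1);
    if (!out) {
        free(r);
        return NULL;
    }
    pos = 0;			/* start of the next unmatched chunk */
    for (i = 0; i < n; i++) {
        memcpy(t, s + pos, r[i].b - pos);
        t += r[i].b - pos;
        memcpy(t, repl, rlen);
        t += rlen;
        pos = r[i].e;
    }
    memcpy(t, s + pos, l - pos);	/* final chunk after the last match */
    out[out_len] = '\0';
    free(r);
    return out;
}
```

Because the replacement text is never rescanned, replacing "a" with
"aa" in "aaa" terminates and gives "aaaaaa" rather than looping on its
own output.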

Since this is zsh, and since it fits in with the existing code, you
can get special effects (described in the manual) without me needing
to do anything clever (the following don't use patterns but of course
it works for those):

% foo=wimbaweawimbawe
% print ${(I.2.)foo/w/z}             # second occurrence only
wimbazeawimbawe
% print ${(I.2.)foo//w/z}            # all occurrences from the second
wimbazeazimbaze
% print ${foo:/wimba/burble}         # has to match the whole string
wimbaweawimbawe
% print ${foo:/wimbaweawimbawe/burble}
burble
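
For anyone who finds the (I.n.) behaviour easier to read as code, the
first two examples above can be modelled roughly as follows.  This is a
sketch under assumptions, not what the patch does internally: the
function name replace_from_nth is made up, and it matches literal
strings with strncmp instead of zsh patterns.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Replace occurrences of the literal string pat in s with repl,
 * starting at the nth occurrence (counting from 1).  If global is zero
 * only that occurrence is replaced; otherwise it and all later ones
 * are.  Hypothetical helper; returns a malloc'd string or NULL.
 */
char *replace_from_nth(const char *s, const char *pat, const char *repl,
                       int n, int global)
{
    size_t l = strlen(s), plen = strlen(pat), rlen = strlen(repl);
    size_t hits = 0;
    const char *p;
    char *out, *t;
    int seen = 0, done = 0;

    assert(plen > 0 && n >= 1);

    /* Upper bound on the result size: as if every hit grew by rlen. */
    for (p = s; (p = strstr(p, pat)) != NULL; p += plen)
        hits++;
    out = t = malloc(l + hits * rlen + 1);
    if (!out)
        return NULL;

    p = s;
    while (*p) {
        if (!done && strncmp(p, pat, plen) == 0) {
            if (++seen >= n) {
                /* The nth occurrence or later: substitute it. */
                memcpy(t, repl, rlen);
                t += rlen;
                if (!global)
                    done = 1;	/* single substitution: stop matching */
            } else {
                /* An earlier occurrence: copy it through untouched. */
                memcpy(t, p, plen);
                t += plen;
            }
            p += plen;
            continue;
        }
        *t++ = *p++;
    }
    *t = '\0';
    return out;
}
```

With s set to "wimbaweawimbawe", pat "w", repl "z" and n equal to 2,
passing global as 0 corresponds to the ${(I.2.)foo/w/z} case and
passing 1 corresponds to ${(I.2.)foo//w/z}.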

Arrays and elements thereof work as expected (thanks probably to
Zoli's last major surgery).

In fact, the internals are pretty much all there to be able to replace
the shortest match instead of the longest match for the pattern.  The
only thing missing is the syntax.  If somebody suggests some, I will
add it.  (This is easy basically because zsh does all this stuff in a
hugely inefficient way, simply by reducing the test string from either
end until a part of it matches.  If anybody wants to take a year off
and fix this...)
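
To make the inefficiency concrete, here is a rough standalone model of
the search order: try every candidate substring, longest first and
leftmost first at each length, and hand each one to the matcher.  It
assumes POSIX fnmatch as a stand-in for zsh's own matcher, copies each
candidate into a scratch buffer instead of poking NULs into the string,
and ignores Meta characters and empty matches.  The name longest_match
is hypothetical.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find the longest substring of s that matches the glob pattern pat,
 * preferring the leftmost position at each length.  On success store
 * the half-open range [*b, *e) and return 1; return 0 if nothing
 * matches or allocation fails.  Empty matches are not considered.
 */
int longest_match(const char *s, const char *pat, size_t *b, size_t *e)
{
    size_t l = strlen(s), len, start;
    char *buf = malloc(l + 1);

    if (!buf)
        return 0;
    for (len = l; len > 0; len--) {		/* shrink the candidate */
        for (start = 0; start + len <= l; start++) {	/* slide it along */
            memcpy(buf, s + start, len);
            buf[len] = '\0';
            if (fnmatch(pat, buf, 0) == 0) {
                *b = start;
                *e = start + len;
                free(buf);
                return 1;
            }
        }
    }
    free(buf);
    return 0;
}
```

For a string of length l this calls the matcher on the order of l*l
times in the worst case, which is why it is cheap to write and slow to
run.  Finding the shortest match instead would only mean running the
outer loop upwards from 1, which fits the remark that the internals are
mostly there already.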

I hope this applies to a reasonably clean 3.1.5, but there isn't one
around in these parts.  At one stage it got mixed up with some of
Sven's Conddef's in zsh.h, but if you just expect a bit of offset
things should be OK.

(If you really want to know why the old files have the suffix .pmfl,
it's because I started off by altering all the parameter substitution
flags to symbols, a long overdue change.)

Oh, and I fixed some more else-dangleage in subst.c.

*** Doc/Zsh/expn.yo.pmfl	Tue Nov 10 10:10:01 1998
--- Doc/Zsh/expn.yo	Wed Dec  9 17:27:02 1998
***************
*** 398,403 ****
--- 398,421 ----
  the matched array elements are removed (use the tt((M)) flag to
  remove the non-matched elements).
  )
+ xitem(tt(${)var(name)tt(/)var(pattern)tt(/)var(repl)tt(}))
+ item(tt(${)var(name)tt(//)var(pattern)tt(/)var(repl)tt(}))(
+ Substitute the longest possible match of var(pattern) in the value of
+ variable var(name) with the string var(repl).  The first form
+ substitutes just the first occurrence, the second all occurrences.
+ The var(pattern) may begin with a var(#), in which case the
+ var(pattern) must match at the start of the string, or var(%), in
+ which case it must match at the end of the string.  The var(repl) may
+ be an empty string, in which case the final tt(/) may also be omitted.
+ To quote the final tt(/) in other cases it should be preceded by two
+ backslashes (i.e., a quoted backslash).  Substitution of an array is as
+ described for tt(#) and tt(%) above.
+ 
+ The first tt(/) may be preceded by a tt(:), in which case the match
+ will only succeed if it matches the entire word.  Note also the
+ effect of the tt(I) parameter expansion flag below:  the flags tt(S),
+ tt(M), tt(R), tt(B), tt(E) and tt(N) are not useful, however.
+ )
  item(tt(${#)var(spec)tt(}))(
  If var(spec) is one of the above substitutions, substitute
  the length in characters of the result instead of
***************
*** 553,558 ****
--- 571,580 ----
  )
  item(tt(I:)var(expr)tt(:))(
  Search the var(expr)th match (where var(expr) evaluates to a number).
+ This may be used with tt(${)...tt(/)...tt(}) or
+ tt(${)...tt(//)...tt(}) substitution:  in the first case, only the
+ var(expr)th match is substituted, while in the second case,  all
+ matches from the var(expr)th on are substituted.
  )
  item(tt(M))(
  Include the matched portion in the result.
*** Src/glob.c.pmfl	Wed Dec  9 11:25:29 1998
--- Src/glob.c	Wed Dec  9 17:53:36 1998
***************
*** 1806,1846 ****
  /* do the ${foo%%bar}, ${foo#bar} stuff */
  /* please do not laugh at this code. */
  
  /* Having found a match in getmatch, decide what part of string
   * to return.  The matched part starts b characters into string s
   * and finishes e characters in: 0 <= b <= e <= strlen(s)
   * (yes, empty matches should work).
!  * Bits 3 and higher in fl are used: the flags are
!  *   8:		Result is matched portion.
!  *  16:		Result is unmatched portion.
!  *		(N.B. this should be set for standard ${foo#bar} etc. matches.)
!  *  32:		Result is numeric position of start of matched portion.
!  *  64:		Result is numeric position of end of matched portion.
!  * 128:		Result is length of matched portion.
   */
  
  /**/
  static char *
! get_match_ret(char *s, int b, int e, int fl)
  {
      char buf[80], *r, *p, *rr;
      int ll = 0, l = strlen(s), bl = 0, t = 0, i;
  
!     if (fl & 8)			/* matched portion */
  	ll += 1 + (e - b);
!     if (fl & 16)		/* unmatched portion */
  	ll += 1 + (l - (e - b));
!     if (fl & 32) {
  	/* position of start of matched portion */
  	sprintf(buf, "%d ", b + 1);
  	ll += (bl = strlen(buf));
      }
!     if (fl & 64) {
  	/* position of end of matched portion */
  	sprintf(buf + bl, "%d ", e + 1);
  	ll += (bl = strlen(buf));
      }
!     if (fl & 128) {
  	/* length of matched portion */
  	sprintf(buf + bl, "%d ", e - b);
  	ll += (bl = strlen(buf));
--- 1806,1867 ----
  /* do the ${foo%%bar}, ${foo#bar} stuff */
  /* please do not laugh at this code. */
  
+ struct repldata {
+     int b, e;			/* beginning and end of chunk to replace */
+ };
+ typedef struct repldata *Repldata;
+ 
+ /* 
+  * List of bits of matches to concatenate with replacement string.
+  * The data is a struct repldata.  It is not used in cases like
+  * ${...//#foo/bar} even though SUB_GLOBAL is set, since the match
+  * is anchored.  It goes on the heap.
+  */
+ 
+ static LinkList repllist;
+ 
  /* Having found a match in getmatch, decide what part of string
   * to return.  The matched part starts b characters into string s
   * and finishes e characters in: 0 <= b <= e <= strlen(s)
   * (yes, empty matches should work).
!  * fl is a set of the SUB_* matches defined in zsh.h from SUB_MATCH onwards;
!  * the lower parts are ignored.
!  * replstr is the replacement string for a substitution
   */
  
  /**/
  static char *
! get_match_ret(char *s, int b, int e, int fl, char *replstr)
  {
      char buf[80], *r, *p, *rr;
      int ll = 0, l = strlen(s), bl = 0, t = 0, i;
  
!     if (replstr) {
! 	if ((fl & SUB_GLOBAL) && repllist) {
! 	    /* We are replacing the chunk, just add this to the list */
! 	    Repldata rd = (Repldata) halloc(sizeof(*rd));
! 	    rd->b = b;
! 	    rd->e = e;
! 	    addlinknode(repllist, rd);
! 	    return s;
! 	}
! 	ll += strlen(replstr);
!     }
!     if (fl & SUB_MATCH)			/* matched portion */
  	ll += 1 + (e - b);
!     if (fl & SUB_REST)		/* unmatched portion */
  	ll += 1 + (l - (e - b));
!     if (fl & SUB_BIND) {
  	/* position of start of matched portion */
  	sprintf(buf, "%d ", b + 1);
  	ll += (bl = strlen(buf));
      }
!     if (fl & SUB_EIND) {
  	/* position of end of matched portion */
  	sprintf(buf + bl, "%d ", e + 1);
  	ll += (bl = strlen(buf));
      }
!     if (fl & SUB_LEN) {
  	/* length of matched portion */
  	sprintf(buf + bl, "%d ", e - b);
  	ll += (bl = strlen(buf));
***************
*** 1850,1862 ****
  
      rr = r = (char *)ncalloc(ll);
  
!     if (fl & 8) {
  	/* copy matched portion to new buffer */
  	for (i = b, p = s + b; i < e; i++)
  	    *rr++ = *p++;
  	t = 1;
      }
!     if (fl & 16) {
  	/* Copy unmatched portion to buffer.  If both portions *
  	 * requested, put a space in between (why?)            */
  	if (t)
--- 1871,1883 ----
  
      rr = r = (char *)ncalloc(ll);
  
!     if (fl & SUB_MATCH) {
  	/* copy matched portion to new buffer */
  	for (i = b, p = s + b; i < e; i++)
  	    *rr++ = *p++;
  	t = 1;
      }
!     if (fl & SUB_REST) {
  	/* Copy unmatched portion to buffer.  If both portions *
  	 * requested, put a space in between (why?)            */
  	if (t)
***************
*** 1864,1869 ****
--- 1885,1893 ----
  	/* there may be unmatched bits at both beginning and end of string */
  	for (i = 0, p = s; i < b; i++)
  	    *rr++ = *p++;
+ 	if (replstr)
+ 	    for (p = replstr; *p; )
+ 		*rr++ = *p++;
  	for (i = e, p = s + e; i < l; i++)
  	    *rr++ = *p++;
  	t = 1;
***************
*** 1879,1920 ****
      return r;
  }
  
! /* It is called from paramsubst to get the match for ${foo#bar} etc.
!  * Bits of fl determines the required action:
!  *   bit 0: match the end instead of the beginning (% or %%)
!  *   bit 1: % or # was doubled so get the longest match
!  *   bit 2: substring match
!  *   bit 3: include the matched portion
!  *   bit 4: include the unmatched portion
!  *   bit 5: the index of the beginning
!  *   bit 6: the index of the end
!  *   bit 7: the length of the match
!  *   bit 8: match the complete string
   * *sp points to the string we have to modify. The n'th match will be
   * returned in *sp. ncalloc is used to get memory for the result string.
   */
  
  /**/
  int
! getmatch(char **sp, char *pat, int fl, int n)
  {
      Comp c;
!     char *s = *sp, *t, sav;
!     int i, j, l = strlen(*sp);
  
      c = parsereg(pat);
      if (!c) {
  	zerr("bad pattern: %s", pat, 0);
  	return 1;
      }
!     if (fl & 256) {
  	i = domatch(s, c, 0);
! 	*sp = get_match_ret(*sp, 0, domatch(s, c, 0) ? l : 0, fl);
! 	if (! **sp && (((fl & 8) && !i) || ((fl & 16) && i)))
  	    return 0;
  	return 1;
      }
!     switch (fl & 7) {
      case 0:
  	/* Smallest possible match at head of string:    *
  	 * start adding characters until we get a match. */
--- 1903,1940 ----
      return r;
  }
  
! /*
!  * This is called from paramsubst to get the match for ${foo#bar} etc.
!  * fl is a set of the SUB_* flags defined in zsh.h
   * *sp points to the string we have to modify. The n'th match will be
   * returned in *sp. ncalloc is used to get memory for the result string.
+  * replstr is the replacement string from a ${.../orig/repl}, in
+  * which case pat is the original.
   */
  
  /**/
  int
! getmatch(char **sp, char *pat, int fl, int n, char *replstr)
  {
      Comp c;
!     char *s = *sp, *t, *start, sav;
!     int i, j, l = strlen(*sp), lleft, matched;
  
+     MUSTUSEHEAP("getmatch");	/* presumably covered by prefork() test */
+     repllist = NULL;
      c = parsereg(pat);
      if (!c) {
  	zerr("bad pattern: %s", pat, 0);
  	return 1;
      }
!     if (fl & SUB_ALL) {
  	i = domatch(s, c, 0);
! 	*sp = get_match_ret(*sp, 0, i ? l : 0, fl, i ? replstr : 0);
! 	if (! **sp && (((fl & SUB_MATCH) && !i) || ((fl & SUB_REST) && i)))
  	    return 0;
  	return 1;
      }
!     switch (fl & (SUB_END|SUB_LONG|SUB_SUBSTR)) {
      case 0:
  	/* Smallest possible match at head of string:    *
  	 * start adding characters until we get a match. */
***************
*** 1923,1929 ****
  	    *t = '\0';
  	    if (domatch(s, c, 0) && !--n) {
  		*t = sav;
! 		*sp = get_match_ret(*sp, 0, i, fl);
  		return 1;
  	    }
  	    if ((*t = sav) == Meta)
--- 1943,1949 ----
  	    *t = '\0';
  	    if (domatch(s, c, 0) && !--n) {
  		*t = sav;
! 		*sp = get_match_ret(*sp, 0, i, fl, replstr);
  		return 1;
  	    }
  	    if ((*t = sav) == Meta)
***************
*** 1931,1942 ****
  	}
  	break;
  
!     case 1:
  	/* Smallest possible match at tail of string:  *
  	 * move back down string until we get a match. */
  	for (t = s + l; t >= s; t--) {
  	    if (domatch(t, c, 0) && !--n) {
! 		*sp = get_match_ret(*sp, t - s, l, fl);
  		return 1;
  	    }
  	    if (t > s+1 && t[-2] == Meta)
--- 1951,1962 ----
  	}
  	break;
  
!     case SUB_END:
  	/* Smallest possible match at tail of string:  *
  	 * move back down string until we get a match. */
  	for (t = s + l; t >= s; t--) {
  	    if (domatch(t, c, 0) && !--n) {
! 		*sp = get_match_ret(*sp, t - s, l, fl, replstr);
  		return 1;
  	    }
  	    if (t > s+1 && t[-2] == Meta)
***************
*** 1944,1950 ****
  	}
  	break;
  
!     case 2:
  	/* Largest possible match at head of string:        *
  	 * delete characters from end until we get a match. */
  	for (t = s + l; t > s; t--) {
--- 1964,1970 ----
  	}
  	break;
  
!     case SUB_LONG:
  	/* Largest possible match at head of string:        *
  	 * delete characters from end until we get a match. */
  	for (t = s + l; t > s; t--) {
***************
*** 1952,1958 ****
  	    *t = '\0';
  	    if (domatch(s, c, 0) && !--n) {
  		*t = sav;
! 		*sp = get_match_ret(*sp, 0, t - s, fl);
  		return 1;
  	    }
  	    *t = sav;
--- 1972,1978 ----
  	    *t = '\0';
  	    if (domatch(s, c, 0) && !--n) {
  		*t = sav;
! 		*sp = get_match_ret(*sp, 0, t - s, fl, replstr);
  		return 1;
  	    }
  	    *t = sav;
***************
*** 1961,1972 ****
  	}
  	break;
  
!     case 3:
  	/* Largest possible match at tail of string:       *
  	 * move forward along string until we get a match. */
  	for (i = 0, t = s; i < l; i++, t++) {
  	    if (domatch(t, c, 0) && !--n) {
! 		*sp = get_match_ret(*sp, i, l, fl);
  		return 1;
  	    }
  	    if (*t == Meta)
--- 1981,1992 ----
  	}
  	break;
  
!     case (SUB_END|SUB_LONG):
  	/* Largest possible match at tail of string:       *
  	 * move forward along string until we get a match. */
  	for (i = 0, t = s; i < l; i++, t++) {
  	    if (domatch(t, c, 0) && !--n) {
! 		*sp = get_match_ret(*sp, i, l, fl, replstr);
  		return 1;
  	    }
  	    if (*t == Meta)
***************
*** 1974,1983 ****
  	}
  	break;
  
!     case 4:
  	/* Smallest at start, but matching substrings. */
  	if (domatch(s + l, c, 0) && !--n) {
! 	    *sp = get_match_ret(*sp, 0, 0, fl);
  	    return 1;
  	}
  	for (i = 1; i <= l; i++) {
--- 1994,2003 ----
  	}
  	break;
  
!     case SUB_SUBSTR:
  	/* Smallest at start, but matching substrings. */
  	if (domatch(s + l, c, 0) && !--n) {
! 	    *sp = get_match_ret(*sp, 0, 0, fl, replstr);
  	    return 1;
  	}
  	for (i = 1; i <= l; i++) {
***************
*** 1986,1992 ****
  		s[j] = '\0';
  		if (domatch(t, c, 0) && !--n) {
  		    s[j] = sav;
! 		    *sp = get_match_ret(*sp, t - s, j, fl);
  		    return 1;
  		}
  		if ((s[j] = sav) == Meta)
--- 2006,2012 ----
  		s[j] = '\0';
  		if (domatch(t, c, 0) && !--n) {
  		    s[j] = sav;
! 		    *sp = get_match_ret(*sp, t - s, j, fl, replstr);
  		    return 1;
  		}
  		if ((s[j] = sav) == Meta)
***************
*** 1999,2008 ****
  	}
  	break;
  
!     case 5:
  	/* Smallest at end, matching substrings */
  	if (domatch(s + l, c, 0) && !--n) {
! 	    *sp = get_match_ret(*sp, l, l, fl);
  	    return 1;
  	}
  	for (i = l; i--;) {
--- 2019,2028 ----
  	}
  	break;
  
!     case (SUB_END|SUB_SUBSTR):
  	/* Smallest at end, matching substrings */
  	if (domatch(s + l, c, 0) && !--n) {
! 	    *sp = get_match_ret(*sp, l, l, fl, replstr);
  	    return 1;
  	}
  	for (i = l; i--;) {
***************
*** 2013,2019 ****
  		*t = '\0';
  		if (domatch(s + j, c, 0) && !--n) {
  		    *t = sav;
! 		    *sp = get_match_ret(*sp, j, t - s, fl);
  		    return 1;
  		}
  		*t = sav;
--- 2033,2039 ----
  		*t = '\0';
  		if (domatch(s + j, c, 0) && !--n) {
  		    *t = sav;
! 		    *sp = get_match_ret(*sp, j, t - s, fl, replstr);
  		    return 1;
  		}
  		*t = sav;
***************
*** 2025,2056 ****
  	}
  	break;
  
!     case 6:
  	/* Largest at start, matching substrings. */
! 	for (i = l; i; i--) {
! 	    for (t = s, j = i; j <= l; j++, t++) {
! 		sav = s[j];
! 		s[j] = '\0';
! 		if (domatch(t, c, 0) && !--n) {
! 		    s[j] = sav;
! 		    *sp = get_match_ret(*sp, t - s, j, fl);
! 		    return 1;
  		}
! 		if ((s[j] = sav) == Meta)
! 		    j++;
! 		if (*t == Meta)
! 		    t++;
  	    }
! 	    if (i >= 2 && s[i-2] == Meta)
! 		i--;
! 	}
! 	if (domatch(s + l, c, 0) && !--n) {
! 	    *sp = get_match_ret(*sp, 0, 0, fl);
  	    return 1;
  	}
  	break;
  
!     case 7:
  	/* Largest at end, matching substrings. */
  	for (i = 0; i < l; i++) {
  	    for (t = s + l, j = i; j >= 0; j--, t--) {
--- 2045,2098 ----
  	}
  	break;
  
!     case (SUB_LONG|SUB_SUBSTR):
  	/* Largest at start, matching substrings. */
! 	start = s;
! 	lleft = l;
! 	if (fl & SUB_GLOBAL)
! 	    repllist = newlinklist();
! 	do {
! 	    /* loop over all matches for global substitution */
! 	    matched = 0;
! 	    for (i = lleft; i; i--) {
! 		for (t = start, j = i; j <= lleft; j++, t++) {
! 		    sav = start[j];
! 		    start[j] = '\0';
! 		    if (domatch(t, c, 0) &&
! 			(!--n || ((fl & SUB_GLOBAL) && n <= 0))) {
! 			start[j] = sav;
! 			*sp = get_match_ret(*sp, t - s, j + (start-s), fl,
! 					    replstr);
! 			if (!(fl & SUB_GLOBAL))
! 			    return 1;
! 			matched = j;
! 			start += j;
! 			lleft -= j;
! 			break;
! 		    }
! 		    if ((start[j] = sav) == Meta)
! 			j++;
! 		    if (*t == Meta)
! 			t++;
  		}
! 		if (matched)
! 		    break;
! 		if (i >= 2 && s[i-2] == Meta)
! 		    i--;
  	    }
! 	} while (matched);
! 	/*
! 	 * check if we can match a blank string, if so do it
! 	 * at the start.  Goodness knows if this is a good idea
! 	 * with global substitution, so it doesn't happen.
! 	 */
! 	if (!(fl & SUB_GLOBAL) && domatch(s + l, c, 0) && !--n) {
! 	    *sp = get_match_ret(*sp, 0, 0, fl, replstr);
  	    return 1;
  	}
  	break;
  
!     case (SUB_END|SUB_LONG|SUB_SUBSTR):
  	/* Largest at end, matching substrings. */
  	for (i = 0; i < l; i++) {
  	    for (t = s + l, j = i; j >= 0; j--, t--) {
***************
*** 2058,2064 ****
  		*t = '\0';
  		if (domatch(s + j, c, 0) && !--n) {
  		    *t = sav;
! 		    *sp = get_match_ret(*sp, j, t - s, fl);
  		    return 1;
  		}
  		*t = sav;
--- 2100,2106 ----
  		*t = '\0';
  		if (domatch(s + j, c, 0) && !--n) {
  		    *t = sav;
! 		    *sp = get_match_ret(*sp, j, t - s, fl, replstr);
  		    return 1;
  		}
  		*t = sav;
***************
*** 2071,2083 ****
  		i++;
  	}
  	if (domatch(s + l, c, 0) && !--n) {
! 	    *sp = get_match_ret(*sp, l, l, fl);
  	    return 1;
  	}
  	break;
      }
!     /* munge the whole string */
!     *sp = get_match_ret(*sp, 0, 0, fl);
      return 1;
  }
  
--- 2113,2158 ----
  		i++;
  	}
  	if (domatch(s + l, c, 0) && !--n) {
! 	    *sp = get_match_ret(*sp, l, l, fl, replstr);
  	    return 1;
  	}
  	break;
      }
! 
!     if (repllist && nonempty(repllist)) {
! 	/* Put all the bits of a global search and replace together. */
! 	LinkNode nd;
! 	Repldata rd;
! 	int rlen;
! 
! 	lleft = 0;		/* size of returned string */
! 	i = 0;			/* start of last chunk we got from *sp */
! 	rlen = strlen(replstr);
! 	for (nd = firstnode(repllist); nd; incnode(nd)) {
! 	    rd = (Repldata) getdata(nd);
! 	    lleft += rd->b - i; /* previous chunk of *sp */
! 	    lleft += rlen;	/* the replaced bit */
! 	    i = rd->e;		/* start of next chunk of *sp */
! 	}
! 	lleft += l - i;	/* final chunk from *sp */
! 	start = t = halloc(lleft+1);
! 	i = 0;
! 	for (nd = firstnode(repllist); nd; incnode(nd)) {
! 	    rd = (Repldata) getdata(nd);
! 	    memcpy(t, s + i, rd->b - i);
! 	    t += rd->b - i;
! 	    memcpy(t, replstr, rlen);
! 	    t += rlen;
! 	    i = rd->e;
! 	}
! 	memcpy(t, s + i, l - i);
! 	start[lleft] = '\0';
! 	*sp = start;
! 	return 1;
!     }
! 
!     /* munge the whole string: no match, so no replstr */
!     *sp = get_match_ret(*sp, 0, 0, fl, 0);
      return 1;
  }
  
*** Src/subst.c.pmfl	Wed Dec  9 11:25:29 1998
--- Src/subst.c	Wed Dec  9 17:18:19 1998
***************
*** 99,105 ****
      char *str  = str3;
  
      while (!errflag && *str) {
! 	if ((qt = *str == Qstring) || *str == String)
  	    if (str[1] == Inpar) {
  		str++;
  		goto comsub;
--- 99,105 ----
      char *str  = str3;
  
      while (!errflag && *str) {
! 	if ((qt = *str == Qstring) || *str == String) {
  	    if (str[1] == Inpar) {
  		str++;
  		goto comsub;
***************
*** 125,131 ****
  		str3 = (char *)getdata(node);
  		continue;
  	    }
! 	else if ((qt = *str == Qtick) || *str == Tick)
  	  comsub: {
  	    LinkList pl;
  	    char *s, *str2 = str;
--- 125,131 ----
  		str3 = (char *)getdata(node);
  		continue;
  	    }
! 	} else if ((qt = *str == Qtick) || *str == Tick)
  	  comsub: {
  	    LinkList pl;
  	    char *s, *str2 = str;
***************
*** 135,142 ****
  	    if (*str == Inpar) {
  		endchar = Outpar;
  		str[-1] = '\0';
  		if (skipparens(Inpar, Outpar, &str))
! 		    DPUTS(1, "BUG: parse error in command substitution");
  		str--;
  	    } else {
  		endchar = *str;
--- 135,146 ----
  	    if (*str == Inpar) {
  		endchar = Outpar;
  		str[-1] = '\0';
+ #ifdef DEBUG
  		if (skipparens(Inpar, Outpar, &str))
! 		    dputs("BUG: parse error in command substitution");
! #else
! 		skipparens(Inpar, Outpar, &str);
! #endif
  		str--;
  	    } else {
  		endchar = *str;
***************
*** 298,304 ****
      if (!assign)
  	return;
  
!     if (assign < 3)
  	if ((*namptr)[1] && (sub = strchr(*namptr + 1, Equals))) {
  	    if (assign == 1)
  		for (ptr = *namptr; ptr != sub; ptr++)
--- 302,308 ----
      if (!assign)
  	return;
  
!     if (assign < 3) {
  	if ((*namptr)[1] && (sub = strchr(*namptr + 1, Equals))) {
  	    if (assign == 1)
  		for (ptr = *namptr; ptr != sub; ptr++)
***************
*** 311,316 ****
--- 315,321 ----
  	    }
  	} else
  	    return;
+     }
  
      ptr = *namptr;
      while ((sub = strchr(ptr, ':'))) {
***************
*** 691,697 ****
      char *aptr = *str;
      char *s = aptr, *fstr, *idbeg, *idend, *ostr = (char *) getdata(n);
      int colf;			/* != 0 means we found a colon after the name */
-     int doub = 0;		/* != 0 means we have %%, not %, or ##, not # */
      int isarr = 0;
      int plan9 = isset(RCEXPANDPARAM);
      int globsubst = isset(GLOBSUBST);
--- 696,701 ----
***************
*** 705,715 ****
      Value v;
      int flags = 0;
      int flnum = 0;
-     int substr = 0;
      int sortit = 0, casind = 0;
      int casmod = 0;
      char *sep = NULL, *spsep = NULL;
      char *premul = NULL, *postmul = NULL, *preone = NULL, *postone = NULL;
      long prenum = 0, postnum = 0;
      int copied = 0;
      int arrasg = 0;
--- 709,719 ----
      Value v;
      int flags = 0;
      int flnum = 0;
      int sortit = 0, casind = 0;
      int casmod = 0;
      char *sep = NULL, *spsep = NULL;
      char *premul = NULL, *postmul = NULL, *preone = NULL, *postone = NULL;
+     char *replstr = NULL;	/* replacement string for /orig/repl */
      long prenum = 0, postnum = 0;
      int copied = 0;
      int arrasg = 0;
***************
*** 764,785 ****
  		    nojoin = 1;
  		    break;
  		case 'M':
! 		    flags |= 8;
  		    break;
  		case 'R':
! 		    flags |= 16;
  		    break;
  		case 'B':
! 		    flags |= 32;
  		    break;
  		case 'E':
! 		    flags |= 64;
  		    break;
  		case 'N':
! 		    flags |= 128;
  		    break;
  		case 'S':
! 		    substr = 1;
  		    break;
  		case 'I':
  		    flnum = get_intarg(&s);
--- 768,789 ----
  		    nojoin = 1;
  		    break;
  		case 'M':
! 		    flags |= SUB_MATCH;
  		    break;
  		case 'R':
! 		    flags |= SUB_REST;
  		    break;
  		case 'B':
! 		    flags |= SUB_BIND;
  		    break;
  		case 'E':
! 		    flags |= SUB_EIND;
  		    break;
  		case 'N':
! 		    flags |= SUB_LEN;
  		    break;
  		case 'S':
! 		    flags |= SUB_SUBSTR;
  		    break;
  		case 'I':
  		    flnum = get_intarg(&s);
***************
*** 940,946 ****
  		s++;
  	    } else
  		globsubst = 1;
! 	} else if (*s == '+')
  	    if (iident(s[1]))
  		chkset = 1, s++;
  	    else if (!inbrace) {
--- 944,950 ----
  		s++;
  	    } else
  		globsubst = 1;
! 	} else if (*s == '+') {
  	    if (iident(s[1]))
  		chkset = 1, s++;
  	    else if (!inbrace) {
***************
*** 951,957 ****
  		zerr("bad substitution", NULL, 0);
  		return NULL;
  	    }
! 	else
  	    break;
      }
      globsubst = globsubst && !qt;
--- 955,961 ----
  		zerr("bad substitution", NULL, 0);
  		return NULL;
  	    }
! 	} else
  	    break;
      }
      globsubst = globsubst && !qt;
***************
*** 1124,1146 ****
  		    *s == '=' || *s == Equals ||
  		    *s == '%' ||
  		    *s == '#' || *s == Pound ||
! 		    *s == '?' || *s == Quest)) {
  
  	if (!flnum)
  	    flnum++;
  	if (*s == '%')
! 	    flags |= 1;
  
  	/* Check for ${..%%..} or ${..##..} */
  	if ((*s == '%' || *s == '#' || *s == Pound) && *s == s[1]) {
  	    s++;
! 	    doub = 1;
  	}
  	s++;
  
! 	flags |= (doub << 1) | (substr << 2) | (colf << 8);
! 	if (!(flags & 0xf8))
! 	    flags |= 16;
  
  	if (colf && !vunset)
  	    vunset = (isarr) ? !*aval : !*val || (*val == Nularg && !val[1]);
--- 1128,1195 ----
  		    *s == '=' || *s == Equals ||
  		    *s == '%' ||
  		    *s == '#' || *s == Pound ||
! 		    *s == '?' || *s == Quest ||
! 		    *s == '/')) {
  
  	if (!flnum)
  	    flnum++;
  	if (*s == '%')
! 	    flags |= SUB_END;
  
  	/* Check for ${..%%..} or ${..##..} */
  	if ((*s == '%' || *s == '#' || *s == Pound) && *s == s[1]) {
  	    s++;
! 	    /* we have %%, not %, or ##, not # */
! 	    flags |= SUB_LONG;
  	}
  	s++;
+ 	if (s[-1] == '/') {
+ 	    char *ptr;
+ 	    /* previous flags are irrelevant: we always want longest match */
+ 	    flags = SUB_LONG;
+ 	    if (*s == '/') {
+ 		/* doubled, so replace all occurrences */
+ 		flags |= SUB_GLOBAL;
+ 		s++;
+ 	    }
+ 	    /* Check for anchored substitution */
+ 	    if (*s == '%') {
+ 		/* anchor at tail */
+ 		flags |= SUB_END;
+ 		s++;
+ 	    } else if (*s == '#' || *s == Pound) {
+ 		/* anchor at head: this is the `normal' case in getmatch */
+ 		s++;
+ 	    } else
+ 		flags |= SUB_SUBSTR;
+ 	    /*
+ 	     * Find the / marking the end of the search pattern.
+ 	     * If there isn't one, we're just going to delete that,
+ 	     * i.e. replace it with an empty string.
+ 	     *
+ 	     * This allows quotation of the slash with '\\/'. Why
+ 	     * two?  Well, for a non-quoted string we can check for
+ 	     * Bnull+/, which is what you get from `\/', but inside
+ 	     * double quotes the Bnull isn't there, so it's not
+ 	     * consistent.
+ 	     */
+ 	    for (ptr = s; *ptr && *ptr != '/'; ptr++)
+ 		if (*ptr == '\\' && ptr[1] == '/')
+ 		    chuck(ptr);
+ 	    replstr = (*ptr && ptr[1]) ? ptr+1 : "";
+ 	    untokenize(replstr);
+ 	    *ptr = '\0';
+ 	}
  
! 	if (colf)
! 	    flags |= SUB_ALL;
! 	/*
! 	 * With no special flags, i.e. just a # or % or whatever,
! 	 * the matched portion is removed and we keep the rest.
! 	 * We also want the rest when we're doing a substitution.
! 	 */
! 	if (!(flags & (SUB_MATCH|SUB_REST|SUB_BIND|SUB_EIND|SUB_LEN)))
! 	    flags |= SUB_REST;
  
  	if (colf && !vunset)
  	    vunset = (isarr) ? !*aval : !*val || (*val == Nularg && !val[1]);
***************
*** 1234,1239 ****
--- 1283,1289 ----
  	case '%':
  	case '#':
  	case Pound:
+ 	case '/':
  	    if (qt)
  		if (parse_subst_string(s)) {
  		    zerr("parse error in ${...%c...} substitution",
***************
*** 1247,1260 ****
  		char **pp = aval = (char **)ncalloc(sizeof(char *) * (arrlen(aval) + 1));
  
  		while ((*pp = *ap++)) {
! 		    if (getmatch(pp, s, flags, flnum))
  			pp++;
  		}
  		copied = 1;
  	    } else {
  		if (vunset)
  		    val = dupstring("");
! 		getmatch(&val, s, flags, flnum);
  		copied = 1;
  	    }
  	    break;
--- 1297,1310 ----
  		char **pp = aval = (char **)ncalloc(sizeof(char *) * (arrlen(aval) + 1));
  
  		while ((*pp = *ap++)) {
! 		    if (getmatch(pp, s, flags, flnum, replstr))
  			pp++;
  		}
  		copied = 1;
  	    } else {
  		if (vunset)
  		    val = dupstring("");
! 		getmatch(&val, s, flags, flnum, replstr);
  		copied = 1;
  	    }
  	    break;
*** Src/zsh.h.pmfl	Wed Dec  9 11:25:29 1998
--- Src/zsh.h	Wed Dec  9 14:57:39 1998
***************
*** 890,895 ****
--- 892,914 ----
  #define PM_DONTIMPORT	(1<<12)	/* do not import this variable                */
  #define PM_RESTRICTED	(1<<13) /* cannot be changed in restricted mode       */
  #define PM_UNSET	(1<<14)	/* has null value                             */
+ 
+ /*
+  * Flags for doing matches inside parameter substitutions, i.e.
+  * ${...#...} and friends.  This could be an enum, but so
+  * could a lot of other things.
+  */
+ 
+ #define SUB_END		0x0001	/* match end instead of beginning, % or %%  */
+ #define SUB_LONG	0x0002	/* % or # doubled, get longest match */
+ #define SUB_SUBSTR	0x0004	/* match a substring */
+ #define SUB_MATCH	0x0008	/* include the matched portion */
+ #define SUB_REST	0x0010	/* include the unmatched portion */
+ #define SUB_BIND	0x0020	/* index of beginning of string */
+ #define SUB_EIND	0x0040	/* index of end of string */
+ #define SUB_LEN		0x0080	/* length of match */
+ #define SUB_ALL		0x0100	/* match complete string */
+ #define SUB_GLOBAL	0x0200	/* global substitution ${..//all/these} */
  
  /* node for named directory hash table (nameddirtab) */
  
  
-- 
Peter Stephenson <pws@xxxxxxxxxxxxxxxxx>       Tel: +39 050 844536
WWW:  http://www.ifh.de/~pws/
Dipartimento di Fisica, Via Buonarroti 2, 56127 Pisa, Italy


