Zsh Mailing List Archive

Re: Prepend/append to the members of a list



Jörg Sommer wrote:
> somewhere I've used
> 
> % ls foo bar bla
> 
> Then I've left the directory. Later I want to do
> 
> % ls !ls*:(s/^/dir/)
> 
> But it does not work, because the s/// expression doesn't know ^.
> 
> Or another example:
> 
> % cp dir/*.[1-9](:t:s+^+/usr/share/man/man?/+:s/$/.gz/) .
> 
> How can I make it? Is this possible?

It's not so difficult to add an option HIST_SUBST_PATTERN to allow
pattern matching.  However, these are zsh patterns, not regular
expressions, so (as with substitution in parameter expansion) you use #
to anchor at the head and % to anchor at the tail (and you can combine
them as #% to anchor at both ends).

setopt histsubstpattern
ls !ls*:s+#+dir/+

What you wanted didn't need pattern matching, but to avoid
incompatibilities it did need a new option, so I extended it a bit
more generously while I was adding that.  See the documentation
for examples.  As always, tokenization and quoting are a bit hairier than
they deserve to be.

I can't see any good reason to use this in place of ${.../...}
substitution; it works in parameters just to be consistent.
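
For comparison, the ${.../...} form mentioned above can prepend or append
to every element of an array without the new option.  The following is
only a minimal sketch and is not part of the patch; the names files,
prefix, prefixed and suffixed are hypothetical.

```shell
# Hypothetical list of names, standing in for the words of the earlier ls command.
files=(foo bar bla)
prefix=dir/

# "#" followed by an empty pattern anchors at the head of each element,
# so the replacement is prepended.  The prefix is kept in a parameter so
# its slash cannot be mistaken for the pattern/replacement delimiter.
prefixed=("${files[@]/#/$prefix}")

# "%" followed by an empty pattern anchors at the tail of each element,
# so the replacement is appended.
suffixed=("${files[@]/%/.gz}")

printf '%s\n' "${prefixed[@]}" "${suffixed[@]}"
```

This covers the original request when the words are already in an array;
the history modifier form is still the only way to rewrite words taken
straight from a previous command line.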

I discovered a couple of bugs:  subst() was returning permanently
allocated strings, but this was only reflected in the caller in one
place, due to the usual doubts about the (undocumented) API.

Also, longest match substitution at the tail didn't handle null strings;
the use of :s/%/.../ showed this up.  (It's "longest match" because
that's the default behaviour of pattern matches; obviously that shouldn't
affect the result here.)

Index: Completion/compinit
===================================================================
RCS file: /cvsroot/zsh/zsh/Completion/compinit,v
retrieving revision 1.16
diff -u -r1.16 compinit
--- Completion/compinit	28 Jun 2006 13:12:55 -0000	1.16
+++ Completion/compinit	31 Oct 2006 14:35:44 -0000
@@ -128,25 +128,26 @@
 # The standard options set in completion functions.
 
 _comp_options=(
-       extendedglob
        bareglobqual
+       extendedglob
        glob
        multibyte
        nullglob
        rcexpandparam
        unset
-    NO_markdirs
+    NO_allexport
+    NO_aliases
+    NO_cshnullglob
+    NO_errexit
     NO_globsubst
-    NO_shwordsplit
-    NO_shglob
+    NO_histsubstpattern
     NO_kshglob
     NO_ksharrays
     NO_kshtypeset
-    NO_cshnullglob
-    NO_allexport
-    NO_aliases
-    NO_errexit
+    NO_markdirs
     NO_octalzeroes
+    NO_shwordsplit
+    NO_shglob
     NO_warncreateglobal
 )
 
Index: Doc/Zsh/expn.yo
===================================================================
RCS file: /cvsroot/zsh/zsh/Doc/Zsh/expn.yo,v
retrieving revision 1.72
diff -u -r1.72 expn.yo
--- Doc/Zsh/expn.yo	13 Oct 2006 21:49:52 -0000	1.72
+++ Doc/Zsh/expn.yo	31 Oct 2006 14:35:46 -0000
@@ -258,7 +258,8 @@
 Substitute var(r) for var(l) as described below.
 The substitution is done only for the
 first string that matches var(l).  For arrays and for filename
-generation, this applies to each word of the expanded text.
+generation, this applies to each word of the expanded text.  See
+below for further notes on substitutions.
 
 The forms `tt(gs/)var(l)tt(/)var(r)' and `tt(s/)var(l)tt(/)var(r)tt(/:G)'
 perform global substitution, i.e. substitute every occurrence of var(r)
@@ -273,8 +274,8 @@
 )
 enditem()
 
-The tt(s/l/r/) substitution works as follows.  The left-hand side of
-substitutions are not regular expressions, but character strings.  Any
+The tt(s/l/r/) substitution works as follows.  By default the left-hand
+side of substitutions are not patterns, but character strings.  Any
 character can be used as the delimiter in place of `tt(/)'.  A
 backslash quotes the delimiter character.  The character `tt(&)', in
 the right-hand-side var(r), is replaced by the text from the
@@ -286,6 +287,40 @@
 Note the same record of the last var(l) and var(r) is maintained
 across all forms of expansion.
 
+If the option tt(HIST_SUBST_PATTERN) is set, var(l) is treated as
+a pattern of the usual form described in
+ifzman(the section FILENAME GENERATION below)\
+ifnzman(noderef(Filename Generation)).  This can be used in
+all the places where modifiers are available; note, however, that
+in globbing qualifiers parameter substitution has already taken place,
+so parameters in the replacement string should be quoted to ensure
+they are replaced at the correct time.
+Note also that complicated patterns used in globbing qualifiers may
+need the extended glob qualifier notation
+tt(LPAR()#q:s/)var(...)tt(/)var(...)tt(/RPAR()) in order for the
+shell to recognize the expression as a glob qualifier.  Further,
+note that bad patterns in the substitution are not subject to
+the tt(NO_BAD_PATTERN) option so will cause an error.
+
+When tt(HIST_SUBST_PATTERN) is set, var(l) may start with a tt(#)
+to indicate that the pattern must match at the start of the string
+to be substituted, and a tt(%) may appear at the start or after a tt(#)
+to indicate that the pattern must match at the end of the string
+to be substituted.
+
+For example, the following piece of filename generation code
+with the tt(EXTENDED_GLOB) option:
+
+example(print *.c+LPAR()#q:s/#%+LPAR()#b+RPAR()s+LPAR()*+RPAR().c/'S${match[1]}.C'/+RPAR())
+
+takes the expansion of tt(*.c) and applies the glob qualifiers in the
+tt(LPAR()#q)var(...)tt(RPAR()) expression, which consists of a substitution
+modifier anchored to the start and end of each word (tt(#%)).  This
+turns on backreferences (tt(LPAR()#b+RPAR())), so that the parenthesised
+subexpression is available in the replacement string as tt(${match[1]}).
+The replacement string is quoted so that the parameter is not substituted
+before the start of filename generation.
+
 The following tt(f), tt(F), tt(w) and tt(W) modifiers work only with
 parameter expansion and filename generation.  They are listed here to
 provide a single point of reference for all modifiers.
Index: Doc/Zsh/options.yo
===================================================================
RCS file: /cvsroot/zsh/zsh/Doc/Zsh/options.yo,v
retrieving revision 1.48
diff -u -r1.48 options.yo
--- Doc/Zsh/options.yo	25 Jul 2006 18:10:38 -0000	1.48
+++ Doc/Zsh/options.yo	31 Oct 2006 14:35:47 -0000
@@ -376,6 +376,15 @@
 filename generation.  Braces (and commas in between) do not become eligible
 for expansion.
 )
+pindex(HIST_SUBST_PATTERN)
+item(tt(HIST_SUBST_PATTERN))(
+Substitutions using the tt(:s) and tt(:&) history modifiers are performed
+with pattern matching instead of string matching.  This occurs wherever
+history modifiers are valid, including glob qualifiers and parameters.
+See
+ifzman(the section Modifiers in zmanref(zshexp))\
+ifnzman(noderef(Modifiers)).
+)
 pindex(IGNORE_BRACES)
 cindex(disabling brace expansion)
 cindex(brace expansion, disabling)
Index: Src/glob.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/glob.c,v
retrieving revision 1.53
diff -u -r1.53 glob.c
--- Src/glob.c	30 Jul 2006 18:00:37 -0000	1.53
+++ Src/glob.c	31 Oct 2006 14:35:50 -0000
@@ -2366,6 +2366,10 @@
 		}
 		umlen -= iincchar(&t);
 	    }
+	    if (pattrylen(p, s + l, 0, 0, ioff)) {
+		*sp = get_match_ret(*sp, l, l, fl, replstr, repllist);
+		return 1;
+	    }
 	    break;
 
 	case SUB_SUBSTR:
@@ -2566,7 +2570,7 @@
 
     /* munge the whole string: no match, so no replstr */
     *sp = get_match_ret(*sp, 0, 0, fl, 0, 0);
-    return 1;
+    return (fl & SUB_RETFAIL) ? 0 : 1;
 }
 
 /**/
Index: Src/hist.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/hist.c,v
retrieving revision 1.65
diff -u -r1.65 hist.c
--- Src/hist.c	28 Jun 2006 13:12:55 -0000	1.65
+++ Src/hist.c	31 Oct 2006 14:35:51 -0000
@@ -323,7 +323,8 @@
     if (strlen(ptr1)) {
 	zsfree(hsubl);
 	hsubl = ptr1;
-    }
+    } else if (!hsubl)		/* fail silently on this */
+	return 0;
     zsfree(hsubr);
     hsubr = ptr2;
     follow = ingetc();
@@ -337,11 +338,6 @@
 	}
     } else
 	inungetc(follow);
-    if (hsubl && !strstr(subline, hsubl)) {
-	herrflush();
-	zerr("substitution failed");
-	return 1;
-    }
     return 0;
 }
 
@@ -354,6 +350,15 @@
     return ehist->nwords ? ehist->nwords-1 : 0;
 }
 
+/**/
+static int
+substfailed(void)
+{
+    herrflush();
+    zerr("substitution failed");
+    return -1;
+}
+
 /* Perform history substitution, returning the next character afterwards. */
 
 /**/
@@ -376,10 +381,15 @@
 	isfirstch = 0;
 	inungetc(hatchar);
 	if (!(ehist = gethist(defev))
-	    || !(sline = getargs(ehist, 0, getargc(ehist)))
-	    || getsubsargs(sline, &gbal, &cflag) || !hsubl)
+	    || !(sline = getargs(ehist, 0, getargc(ehist))))
+	    return -1;
+
+	if (getsubsargs(sline, &gbal, &cflag))
+	    return substfailed();
+	if (!hsubl)
 	    return -1;
-	subst(&sline, hsubl, hsubr, gbal);
+	if (subst(&sline, hsubl, hsubr, gbal))
+	    return substfailed();
     } else {
 	/* Line doesn't begin ^foo^bar */
 	if (c != ' ')
@@ -608,9 +618,10 @@
 		if (getsubsargs(sline, &gbal, &cflag))
 		    return -1; /* fall through */
 	    case '&':
-		if (hsubl && hsubr)
-		    subst(&sline, hsubl, hsubr, gbal);
-		else {
+		if (hsubl && hsubr) {
+		    if (subst(&sline, hsubl, hsubr, gbal))
+			return substfailed();
+		} else {
 		    herrflush();
 		    zerr("no previous substitution");
 		    return -1;
@@ -1629,30 +1640,71 @@
     return str2;
 }
 
+
+/*
+ * Substitute "in" for "out" in "*strptr" and update "*strptr".
+ * If "gbal", do global substitution.
+ *
+ * This returns a result from the heap.  There seems to have
+ * been some confusion on this point.
+ */
+
 /**/
-void
+int
 subst(char **strptr, char *in, char *out, int gbal)
 {
-    char *str = *strptr, *instr = *strptr, *substcut, *sptr, *oldstr;
+    char *str = *strptr, *substcut, *sptr;
     int off, inlen, outlen;
 
     if (!*in)
 	in = str, gbal = 0;
-    if (!(substcut = (char *)strstr(str, in)))
-	return;
-    inlen = strlen(in);
-    sptr = convamps(out, in, inlen);
-    outlen = strlen(sptr);
 
-    do {
-	*substcut = '\0';
-	off = substcut - *strptr + outlen;
-	substcut += inlen;
-	*strptr = tricat(oldstr = *strptr, sptr, substcut);
-	if (oldstr != instr)
-	    zsfree(oldstr);
-	str = (char *)*strptr + off;
-    } while (gbal && (substcut = (char *)strstr(str, in)));
+    if (isset(HISTSUBSTPATTERN)) {
+	int fl = SUB_LONG|SUB_REST|SUB_RETFAIL;
+	char *oldin = in;
+	if (gbal)
+	    fl |= SUB_GLOBAL;
+	if (*in == '#' || *in == Pound) {
+	    /* anchor at head, no flag needed */
+	    in++;
+	}
+	if (*in == '%') {
+	    /* anchor at tail */
+	    in++;
+	    fl |= SUB_END;
+	}
+	if (in == oldin) {
+	    /* no anchor, substring match */
+	    fl |= SUB_SUBSTR;
+	}
+	if (in == str)
+	    in = dupstring(in);
+	if (parse_subst_string(in) || errflag)
+	    return 1;
+	if (parse_subst_string(out) || errflag)
+	    return 1;
+	singsub(&in);
+	if (getmatch(strptr, in, fl, 1, out))
+	    return 0;
+    } else {
+	if ((substcut = (char *)strstr(str, in))) {
+	    inlen = strlen(in);
+	    sptr = convamps(out, in, inlen);
+	    outlen = strlen(sptr);
+
+	    do {
+		*substcut = '\0';
+		off = substcut - *strptr + outlen;
+		substcut += inlen;
+		*strptr = zhtricat(*strptr, sptr, substcut);
+		str = (char *)*strptr + off;
+	    } while (gbal && (substcut = (char *)strstr(str, in)));
+
+	    return 0;
+	}
+    }
+
+    return 1;
 }
 
 /**/
Index: Src/options.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/options.c,v
retrieving revision 1.31
diff -u -r1.31 options.c
--- Src/options.c	22 Aug 2006 09:23:30 -0000	1.31
+++ Src/options.c	31 Oct 2006 14:35:51 -0000
@@ -137,6 +137,7 @@
 {{NULL, "histignorespace",    0},			 HISTIGNORESPACE},
 {{NULL, "histnofunctions",    0},			 HISTNOFUNCTIONS},
 {{NULL, "histnostore",	      0},			 HISTNOSTORE},
+{{NULL, "histsubstpattern",   OPT_EMULATE},              HISTSUBSTPATTERN},
 {{NULL, "histreduceblanks",   0},			 HISTREDUCEBLANKS},
 {{NULL, "histsavebycopy",     OPT_ALL},			 HISTSAVEBYCOPY},
 {{NULL, "histsavenodups",     0},			 HISTSAVENODUPS},
Index: Src/subst.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/subst.c,v
retrieving revision 1.64
diff -u -r1.64 subst.c
--- Src/subst.c	5 Oct 2006 21:53:26 -0000	1.64
+++ Src/subst.c	31 Oct 2006 14:35:52 -0000
@@ -2526,7 +2526,7 @@
             /* This once was executed only `if (qt) ...'. But with that
              * patterns in a expansion resulting from a ${(e)...} aren't
              * tokenized even though this function thinks they are (it thinks
-             * they are because subst_parse_str() turns Qstring tokens
+             * they are because parse_subst_str() turns Qstring tokens
              * into String tokens and for unquoted parameter expansions the
              * lexer normally does tokenize patterns inside parameter
              * expansions). */
@@ -3273,6 +3273,7 @@
 		break;
 
 	    case 's':
+		/* TODO: multibyte delimiter */
 		c = **ptr;
 		(*ptr)++;
 		ptr1 = *ptr;
@@ -3298,7 +3299,8 @@
 		for (tt = hsubl; *tt; tt++)
 		    if (inull(*tt) && *tt != Bnullkeep)
 			chuck(tt--);
-		untokenize(hsubl);
+		if (!isset(HISTSUBSTPATTERN))
+		    untokenize(hsubl);
 		for (tt = hsubr = ztrdup(ptr2); *tt; tt++)
 		    if (inull(*tt) && *tt != Bnullkeep)
 			chuck(tt--);
@@ -3444,15 +3446,8 @@
 		    *str = casemodify(*str, CASMOD_UPPER);
 		    break;
 		case 's':
-		    if (hsubl && hsubr) {
-			char *oldstr = *str;
-
+		    if (hsubl && hsubr)
 			subst(str, hsubl, hsubr, gbal);
-			if (*str != oldstr) {
-			    *str = dupstring(oldstr = *str);
-			    zsfree(oldstr);
-			}
-		    }
 		    break;
 		case 'q':
 		    *str = quotestring(*str, NULL, QT_BACKSLASH);
Index: Src/zsh.h
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/zsh.h,v
retrieving revision 1.101
diff -u -r1.101 zsh.h
--- Src/zsh.h	5 Oct 2006 21:53:27 -0000	1.101
+++ Src/zsh.h	31 Oct 2006 14:35:53 -0000
@@ -1405,6 +1405,7 @@
 #define SUB_ALL		0x0100	/* match complete string */
 #define SUB_GLOBAL	0x0200	/* global substitution ${..//all/these} */
 #define SUB_DOSUBST	0x0400	/* replacement string needs substituting */
+#define SUB_RETFAIL	0x0800  /* return status 0 if no match */
 
 /* Flags as the second argument to prefork */
 #define PF_TYPESET	0x01	/* argument handled like typeset foo=bar */
@@ -1631,6 +1632,7 @@
     HISTREDUCEBLANKS,
     HISTSAVEBYCOPY,
     HISTSAVENODUPS,
+    HISTSUBSTPATTERN,
     HISTVERIFY,
     HUP,
     IGNOREBRACES,
Index: Test/E01options.ztst
===================================================================
RCS file: /cvsroot/zsh/zsh/Test/E01options.ztst,v
retrieving revision 1.16
diff -u -r1.16 E01options.ztst
--- Test/E01options.ztst	23 Sep 2006 06:55:29 -0000	1.16
+++ Test/E01options.ztst	31 Oct 2006 14:35:53 -0000
@@ -487,6 +487,20 @@
 >tmpcd tmpfile1 tmpfile2
 >tmp*
 
+  setopt histsubstpattern
+  print *(:s/t??/TING/)
+  foo=(tmp*)
+  print ${foo:s/??p/THUMP/}
+  foo=(one.c two.c three.c)
+  print ${foo:s/#%(#b)t(*).c/T${match[1]}.X/}
+  print *(#q:s/#(#b)tmp(*e)/'scrunchy${match[1]}'/)
+  unsetopt histsubstpattern
+0:HIST_SUBST_PATTERN option
+>TINGcd TINGfile1 TINGfile2
+>THUMPcd THUMPfile1 THUMPfile2
+>one.c Two.X Three.X
+>scrunchyfile1 scrunchyfile2 tmpcd
+
   setopt ignorebraces
   echo X{a,b}Y
   unsetopt ignorebraces

-- 
Peter Stephenson <pws@xxxxxxx>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070

