Zsh Mailing List Archive

PATCH: specifying arguments in printf formats



This patch adds the feature I mentioned before: conversion specifications
can now name their arguments explicitly, as in '%1$*2$.*3$d', instead of
taking the arguments in order.

You shouldn't mix the two styles of specifying arguments. The manual
recommends against it, as is the case with printf(3). If you do mix
them, it will work, but I may change the exact semantics if that allows
me to improve the code.

Peter Stephenson wrote:
> 
> Oliver Kiddle wrote:
> > The question is how should this interact with the printf(1) feature of
> > reusing the format if more arguments remain. The easy answer would be
> > to not reuse the format if this feature had been used. As an
> > experiment, I've made it remove all arguments up to the last one used.
> > This allows interesting things like:
> >
> > % printf '%2$s %1$s ' 1 2 3 4 5 6 ;echo
> > 2 1 4 3 6 5
> >
> > I can see this having some uses but I can also see it being a problem
> > as this is likely to be used for picking out fields where the arguments
> > are some command in $(...).
> 
> Even in that case, the problem is really with the reuse of the format,
> rather than the special argument-picking syntax.  Maybe it would be best to
> have a command-line option to turn it (the reuse of the format specifier,
> that is) off --- or even on, since it might be regarded as a little florid
> for default behaviour.  But I suppose we're going to have to stick with ksh
> if we're trying to match it.

Reuse of the format is defined in POSIX, so it isn't ksh I'm matching
there. And this new argument-specifying feature is not in ksh.

I've decided that an option, as Peter suggests, is the best way to go
here. It'll have to turn reuse off so that the default keeps to POSIX/ksh,
which is a slight pity as the opposite would perhaps be better.

Any good suggestions on the choice of option letter? -r and -R (for
reuse) are both gone but we want to indicate the opposite of that
anyway.

For the moment, you can use -r but I've not documented that and will
change it later. For ksh compatibility, -r can't be used with -f but
I'll probably suggest that be changed on the shell list.

Oliver

Index: Doc/Zsh/builtins.yo
===================================================================
RCS file: /cvsroot/zsh/zsh/Doc/Zsh/builtins.yo,v
retrieving revision 1.39
diff -u -r1.39 builtins.yo
--- Doc/Zsh/builtins.yo	2001/10/15 11:34:27	1.39
+++ Doc/Zsh/builtins.yo	2001/10/18 14:11:41
@@ -725,9 +725,9 @@
 item(tt(printf) var(format) [ var(arg) ... ])(
 Print the arguments according to the format specification. Formatting
 rules are the same as used in C. The same escape sequences as for tt(echo)
-are recognised in the format. All C format specifications ending in one of
-csdiouxXeEfgGn are handled. In addition to this, `tt(%b)' can be used
-instead of `tt(%s)' to cause escape sequences in the argument to be
+are recognised in the format. All C conversion specifications ending in
+one of csdiouxXeEfgGn are handled. In addition to this, `tt(%b)' can be
+used instead of `tt(%s)' to cause escape sequences in the argument to be
 recognised and `tt(%q)' can be used to quote the argument in such a way
 that allows it to be reused as shell input. With the numeric format
 specifiers, if the corresponding argument starts with a quote character,
@@ -736,6 +736,13 @@
 noderef(Arithmetic Evaluation) for a description of arithmetic
 expressions. With `tt(%n)', the corresponding argument is taken as an
 identifier which is created as an integer parameter.
+
+Normally, conversion specifications are applied to each argument in order
+but they can explicitly specify the var(n)th argument is to be used by
+replacing `tt(%)' by `tt(%)var(n)tt($)' and `tt(*)' by `tt(*)var(n)tt($)'.
+It is recommended that you do not mix references of this explicit style
+with the normal style and the handling of such mixed styles may be subject
+to future change.
 
 If arguments remain unused after formatting, the format string is reused
 until all arguments have been consumed. If more arguments are required by
Index: Src/builtin.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/builtin.c,v
retrieving revision 1.59
diff -u -r1.59 builtin.c
--- Src/builtin.c	2001/10/16 11:16:11	1.59
+++ Src/builtin.c	2001/10/18 14:11:41
@@ -2892,10 +2892,11 @@
 int
 bin_print(char *name, char **args, char *ops, int func)
 {
-    int flen, width, prec, type, argc, n, nnl = 0, ret = 0;
+    int flen, width, prec, type, argc, n, narg;
+    int nnl = 0, ret = 0, maxarg = 0;
     int flags[5], *len;
     char *start, *endptr, *c, *d, *flag, spec[11], *fmt = NULL;
-    char **first, *flagch = "0+- #", save, nullstr = '\0';
+    char **first, *curarg, *flagch = "0+- #", save, nullstr = '\0';
     zlong count = 0;
     FILE *fout = stdout;
 
@@ -3095,6 +3096,11 @@
     /* printf style output */
     *spec='%';
     do {
+    	if (maxarg) {
+	    first += maxarg;
+	    argc -= maxarg;
+    	    maxarg = 0;
+	}
 	for (c = fmt;c-fmt < flen;c++) {
 	    if (*c != '%') {
 		putc(*c, fout);
@@ -3111,11 +3117,29 @@
 
 	    type = prec = -1;
 	    width = 0;
+	    curarg = NULL;
 	    d = spec + 1;
 
+	    if (*c >= '1' && *c <= '9') {
+	    	narg = strtoul(c, &endptr, 0);
+		if (*endptr == '$') {
+		    c = endptr + 1;
+		    DPUTS(narg <= 0, "specified zero or negative arg");
+		    if (narg > argc) {
+		    	zwarnnam(name, "%d: argument specifier out of range",
+			    0, narg);
+			return 1;
+		    } else {
+		    	if (narg > maxarg) maxarg = narg;
+		    	curarg = *(first + narg - 1);
+		    }
+		}
+	    }
+		    
+	    
 	    /* copy only one of each flag as spec has finite size */
 	    memset(flags, 0, sizeof(flags));
-	    while (flag = strchr(flagch, *c)) {
+	    while ((flag = strchr(flagch, *c))) {
 	    	if (!flags[flag - flagch]) {
 	    	    flags[flag - flagch] = 1;
 		    *d++ = *c;
@@ -3123,28 +3147,60 @@
 	    	c++;
 	    }
 
-	    if (*c == '*') {
-		if (*args) width = (int)mathevali(*args++);
-		if (errflag) {
-	    	    errflag = 0;
-		    ret = 1;
-		}
-		c++;
-	    } else if (idigit(*c)) {
+	    if (idigit(*c)) {
 		width = strtoul(c, &endptr, 0);
 		c = endptr;
+	    } else if (*c == '*') {
+		if (idigit(*++c)) {
+		    narg = strtoul(c, &endptr, 0);
+		    if (*endptr == '$') {
+		    	c = endptr + 1;
+			if (narg > argc || narg <= 0) {
+		    	    zwarnnam(name,
+			    	"%d: argument specifier out of range",
+				0, narg);
+			    return 1;
+			} else {
+		    	    if (narg > maxarg) maxarg = narg;
+		    	    args = first + narg - 1;
+			}
+		    }
+		}
+		if (*args) {
+		    width = (int)mathevali(*args++);
+		    if (errflag) {
+			errflag = 0;
+			ret = 1;
+		    }
+		}
 	    }
 	    *d++ = '*';
 
 	    if (*c == '.') {
-		c++;
-		if (*c == '*') {
-		    prec = (*args) ? (int)mathevali(*args++) : 0;
-		    if (errflag) {
-	    	    	errflag = 0;
-			ret = 1;
+		if (*++c == '*') {
+		    if (idigit(*++c)) {
+			narg = strtoul(c, &endptr, 0);
+			if (*endptr == '$') {
+			    c = endptr + 1;
+			    if (narg > argc || narg <= 0) {
+		    		zwarnnam(name,
+				    "%d: argument specifier out of range",
+				    0, narg);
+				return 1;
+			    } else {
+		    		if (narg > maxarg) maxarg = narg;
+		    		args = first + narg - 1;
+			    }
+			}
+		    }
+		    
+		    if (*args) {
+			prec = (int)mathevali(*args++);
+			if (errflag) {
+			    errflag = 0;
+			    ret = 1;
+			}
 		    }
-		    c++;
 		} else if (idigit(*c)) {
 		    prec = strtoul(c, &endptr, 0);
 		    c = endptr;
@@ -3155,30 +3211,30 @@
 	    /* ignore any size modifier */
 	    if (*c == 'l' || *c == 'L' || *c == 'h') c++;
 
+	    if (!curarg && *args) curarg = *args++;
 	    d[1] = '\0';
 	    switch (*d = *c) {
 	    case 'c':
-		if (*args) {
-		    intval = **args;
-		    args++;
+		if (curarg) {
+		    intval = *curarg;
 		} else
 		    intval = 0;
 		print_val(intval);
 		break;
 	    case 's':
-		stringval = *args ? *args++ : &nullstr;
+		stringval = curarg ? curarg : &nullstr;
 		print_val(stringval);
 		break;
 	    case 'b':
-		if (*args) {
+		if (curarg) {
 		    int l;
-		    char *b = getkeystring(*args++, &l, ops['b'] ? 2 : 0, &nnl);
+		    char *b = getkeystring(curarg, &l, ops['b'] ? 2 : 0, &nnl);
 		    fwrite(b, l, 1, fout);
 		    count += l;
 		}
 		break;
 	    case 'q':
-		stringval = *args ? bslashquote(*args++, NULL, 0) : &nullstr;
+		stringval = curarg ? bslashquote(curarg, NULL, 0) : &nullstr;
 		*d = 's';
 		print_val(stringval);
 		break;
@@ -3200,7 +3256,7 @@
 		type=3;
 		break;
 	    case 'n':
-		if (*args) setiparam(*args++, count);
+		if (curarg) setiparam(curarg, count);
 		break;
 	    default:
 	        if (*c) {
@@ -3208,20 +3264,21 @@
 	            c[1] = '\0';
 		}
 		zwarnnam(name, "%s: invalid directive", start, 0);
-		ret = 1;
 		if (*c) c[1] = save;
+		if (fout != stdout)
+		    fclose(fout);
+		return 1;
 	    }
 
 	    if (type > 0) {
-		if (*args && (**args == '\'' || **args == '"' )) {
+		if (curarg && (*curarg == '\'' || *curarg == '"' )) {
 		    if (type == 2) {
-			doubleval = (unsigned char)(*args)[1];
+			doubleval = (unsigned char)curarg[1];
 			print_val(doubleval);
 		    } else {
-			intval = (unsigned char)(*args)[1];
+			intval = (unsigned char)curarg[1];
 			print_val(intval);
 		    }
-		    args++;
 		} else {
 		    switch (type) {
 		    case 1:
@@ -3229,7 +3286,7 @@
  		    	*d++ = 'l';
 #endif
 		    	*d++ = 'l', *d++ = *c, *d = '\0';
-			zlongval = (*args) ? mathevali(*args++) : 0;
+			zlongval = (curarg) ? mathevali(curarg) : 0;
 			if (errflag) {
 			    zlongval = 0;
 			    errflag = 0;
@@ -3238,8 +3295,8 @@
 			print_val(zlongval)
 			break;
 		    case 2:
-			if (*args) {
-			    mnumval = matheval(*args++);
+			if (curarg) {
+			    mnumval = matheval(curarg);
 			    doubleval = (mnumval.type & MN_FLOAT) ?
 			    	mnumval.u.d : (double)mnumval.u.l;
 			} else doubleval = 0;
@@ -3255,9 +3312,9 @@
  		    	*d++ = 'l';
 #endif
 		    	*d++ = 'l', *d++ = *c, *d = '\0';
-			zulongval = (*args) ? mathevali(*args++) : 0;
+			zulongval = (curarg) ? mathevali(curarg) : 0;
 			if (errflag) {
-			    doubleval = 0;
+			    zulongval = 0;
 			    errflag = 0;
 			    ret = 1;
 			}
@@ -3265,10 +3322,13 @@
 		    }
 		}
 	    }
+	    if (maxarg && (args - first > maxarg))
+	    	maxarg = args - first;
 	}
 
+    	if (maxarg) args = first + maxarg;
     /* if there are remaining args, reuse format string */
-    } while (*args && args != first);
+    } while (*args && args != first && !ops['r']);
 
     if (fout != stdout)
 	fclose(fout);
Index: Test/B03print.ztst
===================================================================
RCS file: /cvsroot/zsh/zsh/Test/B03print.ztst,v
retrieving revision 1.2
diff -u -r1.2 B03print.ztst
--- Test/B03print.ztst	2001/10/16 11:16:11	1.2
+++ Test/B03print.ztst	2001/10/18 14:11:41
@@ -50,7 +50,9 @@
 0:test b format specifier
 >	\
 
-# test %q here - it doesn't quite work yet
+ printf '%q\n' '=a=b \ c!'
+0: test q format specifier
+>\=a=b\ \\\ c!
 
  printf '%c\n' char
 0:test c format specifier
@@ -108,6 +110,19 @@
 ?(eval):1: bad math expression: operator expected at `a'
 >0
 
+ printf '%12$s' 1 2 3
+1:out of range argument specifier
+?(eval):printf:1: 12: argument specifier out of range
+
+ printf '%2$s\n' 1 2 3
+1:out of range argument specifier on format reuse
+?(eval):printf:1: 2: argument specifier out of range
+>2
+
+ printf '%*0$d'
+1:out of range argument specifier on width
+?(eval):printf:1: 0: argument specifier out of range
+
  print -m -f 'format - %s.\n' 'z' a b c
 0:format not printed if no arguments left after -m removal
 
@@ -129,7 +144,8 @@
 >two	b:0x2%
 >three	c:0x3%
 
- printf '%0+- #-08.5dx\n' 123
+# this should fill spec string with '%0+- #*.*d\0' - 11 characters
+ printf '%1$0+- #-08.5dx\n' 123
 0:maximal length format specification
 >+00123  x
 
@@ -140,3 +156,41 @@
  printf '%.*g\n' -1 .1
 0:negative precision specified
 >0.1
+
+ printf '%2$s %1$d\n' 1 2
+0:specify argument to output explicitly
+>2 1
+
+ printf '%3$.*1$d\n' 4 0 3
+0:specify output and precision arguments explicitly
+>0003
+
+ printf '%2$d%1$d\n' 1 2 3 4
0:reuse format where arguments are explicitly specified
+>21
+>43
+
+ printf '%1$*2$d' 1 2 3 4 5 6 7 8 9 10;echo
+0:reuse of specified arguments 
+> 1   3     5       7         9
+
+ printf '%1$0+.3d\n' 3
+0:flags mixed with specified argument
+>+003
+
+# The following usage, as stated in the manual, is not recommended and the
+# results are undefined. Tests are here anyway to ensure some form of
+# half-sane behaviour.
+
+ printf '%2$s %s %3$s\n' Morning Good World
+0:mixed style of argument selection
+>Good Morning World
+
+ printf '%*1$.*d\n' 1 2
+0:argument specified for width only
+>00
+
+ print -f '%*.*1$d\n' 1 2 3
+0:argument specified for precision only
+>2
+>000


