Zsh Mailing List Archive
Messages sorted by:
Reverse Date,
Date,
Thread,
Author
[PATCH] zformat: add -qQ options to auto-escape %s
- X-seq: zsh-workers 54602
- From: dana <dana@xxxxxxx>
- To: zsh-workers@xxxxxxx
- Subject: [PATCH] zformat: add -qQ options to auto-escape %s
- Date: Sat, 23 May 2026 02:01:44 -0500
- Archived-at: <https://zsh.org/workers/54602>
- Feedback-id: i9be146f9:Fastmail
- List-id: <zsh-workers.zsh.org>
we were discussing w/54573 last week and mikael suggested that zformat
could have an option that quotes %s in the specs for you, for when the
result will be subject to further expansion (as is often the case in
completion functions)
i like that idea because: (1) the little ${...//\%/%%} dance is annoying
and (2) using that method to escape can actually break width specifiers
and ternary test numbers in the format string. extremely contrived
example:
% d=%x
% zformat -F REPLY '2 chars i wanted: %.2d' d:${d//\%/%%}
% echo $REPLY
2 chars i wanted: %% # oops, garbage
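the failure mode above can be reproduced without zformat at all. the following is only a rough sketch in portable shell: escape_pct, truncate_to and unescape_pct are hypothetical helper names (not part of zsh), printf's precision stands in for the %.2d max width, and sed stands in for both the ${...//\%/%%} escaping and the later expansion pass:

```sh
# escape_pct STRING: double every % (same effect as zsh's ${d//\%/%%})
escape_pct() {
    printf '%s\n' "$1" | sed 's/%/%%/g'
}

# truncate_to N STRING: keep at most N characters (stand-in for %.Nd)
truncate_to() {
    printf "%.${1}s\n" "$2"
}

# unescape_pct STRING: collapse %% to %, as a later expansion pass would
unescape_pct() {
    printf '%s\n' "$1" | sed 's/%%/%/g'
}

d='%x'
escaped=$(escape_pct "$d")       # %%x
cut=$(truncate_to 2 "$escaped")  # %% (the x has been cut off)
unescape_pct "$cut"              # prints: %
```

two characters were requested, but only one survives the second pass, because the truncation counted the escape character.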
with these new options, zformat will quote %s in the specs and adjust
width specifiers and ternary test numbers to account for the extra
characters:
% d=%x
% zformat -qF REPLY '2 chars i wanted: %.2d' d:$d
% echo $REPLY
2 chars i wanted: %%x # extra % ignored, will have 2 chars as
# expected after subsequent processing
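the width adjustment can be sketched in shell as well. this is an illustrative port of the max-width half of the loop the patch adds to zformat_substring(), not the implementation itself; adjusted_max is a hypothetical name, and it takes an already-escaped spec:

```sh
# adjusted_max MAX ESCAPED_SPEC: print MAX enlarged by one for every
# escaped %% pair that starts before the (growing) cut-off point.
# The function body runs in a subshell so its variables stay local.
adjusted_max() (
    max=$1 spec=$2 i=0
    while [ -n "$spec" ]; do
        rest=${spec#?}          # everything after the first character
        c=${spec%"$rest"}       # the first character itself
        if [ "$c" = % ]; then
            if [ "$i" -lt "$max" ]; then
                max=$((max + 1))
            fi
            rest=${rest#?}      # skip the second % of the pair
            i=$((i + 1))
        fi
        spec=$rest
        i=$((i + 1))
    done
    printf '%s\n' "$max"
)

adjusted_max 2 '%%x'                             # prints: 3
printf "%.$(adjusted_max 2 '%%x')s\n" '%%x'      # prints: %%x
```

with the cut-off widened from 2 to 3, the escaped pair and the x both fit, which matches the %%x shown above.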
the -qn form can be used to implement oliver's suggestion in w/54578
that some specs be left un-escaped. though i haven't fully thought that
idea through yet
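for reference, the effect of -qn on the spec arguments can be approximated in plain shell. this is a sketch under the assumption that n counts specs in argument order, as the documentation in the patch describes; quote_first_n is a hypothetical name:

```sh
# quote_first_n N SPEC...: print each spec on its own line, doubling the
# % characters in the first N specs only and leaving the rest untouched.
quote_first_n() (
    n=$1
    shift
    i=1
    for spec in "$@"; do
        if [ "$i" -le "$n" ]; then
            printf '%s\n' "$spec" | sed 's/%/%%/g'
        else
            printf '%s\n' "$spec"
        fi
        i=$((i + 1))
    done
)

quote_first_n 1 'd:%foo%' 'D:%bar%'
# prints:
#   d:%%foo%%
#   D:%bar%
```

escaping the whole argument is safe here because a spec's leading character can never be %, so only the string part after the colon is affected.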
basically this is phase one of an alternate solution for w/54573
as part of this change i made zformat use the normal option-parsing
facilities. this shouldn't cause any issues for existing scripts
dana
diff --git a/Doc/Zsh/mod_zutil.yo b/Doc/Zsh/mod_zutil.yo
index a36104167..0bda242c9 100644
--- a/Doc/Zsh/mod_zutil.yo
+++ b/Doc/Zsh/mod_zutil.yo
@@ -158,11 +158,13 @@ var(pattern) matches at least one of the strings in the value.
enditem()
)
findex(zformat)
-xitem(tt(zformat -f) var(param) var(format) var(spec) ...)
-xitem(tt(zformat -F) var(param) var(format) var(spec) ...)
+xitem(tt(zformat -f) [ tt(-qQ) [ var(n) ] ] var(param) var(format) var(spec) ...)
+xitem(tt(zformat -F) [ tt(-qQ) [ var(n) ] ] var(param) var(format) var(spec) ...)
item(tt(zformat -a) var(array) var(sep) var(spec) ...)(
-This builtin provides different forms of formatting. The first form
-is selected with the tt(-f) option. In this case the var(format)
+This builtin provides different forms of formatting.
+
+The first form is selected with the tt(-f) option.
+In this case the var(format)
string will be modified by replacing sequences starting with a percent
sign in it with strings from the var(spec)s. Each var(spec) should be
of the form `var(char)tt(:)var(string)' which will cause every
@@ -176,9 +178,16 @@ width makes the result be padded with spaces to the right if the
var(string) is shorter than the requested width. Padding to the left
can be achieved by giving a negative minimum field width. If a maximum
field width is specified, the var(string) will be truncated after that
-many characters. After all `tt(%)' sequences for the given var(spec)s
-have been processed, the resulting string is stored in the parameter
-var(param). The sequence `tt(%%)' can be used to produce a literal tt(%).
+many characters.
+
+Any `tt(%)' sequence in var(format) that does not match a given var(spec)
+(or one of the special sequences described below) is output as-is. If
+desired, these sequences may be processed by a second round of
+formatting, by prompt expansion, etc. DASH()- see also tt(-q).
+
+The sequence `tt(%%)' can be used to produce a literal tt(%).
+After all `tt(%)' sequences have been processed, the resulting string is
+stored in the parameter var(param).
The tt(%)-escapes also understand ternary expressions in the form used by
prompts. The tt(%) is followed by a `tt(LPAR())' and then an ordinary
@@ -213,7 +222,36 @@ number the condition is true when the width is em(greater than) that
number, and with a negative number the condition is true when the width
is em(less than or equal to) the absolute value of that number.
-The form, using the tt(-a) option, can be used for aligning
+With tt(-q), `tt(%)' characters in the var(spec)s are escaped as they
+are inserted into the formatted string, and pre-escaped `tt(%)'
+characters in the format string are left as they are. For example:
+
+example(zformat -qF REPLY '%%foo%% %B%d%b' d:%bar%)
+
+outputs `tt(%%foo%% %B%%bar%%%b)' to tt(REPLY). This is useful when the
+formatted string will undergo further expansion DASH()- in this example
+the tt(%B)...tt(%b) sequences could be used with prompt expansion to
+produce bold text. One notable use case is formatting a description to
+be passed to tt(compadd -x) in a completion function.
+
+tt(-q) may be followed by an optional integer argument var(n) to escape
+only the first var(n) var(spec)s. For example,
+
+example(zformat -Fq1 REPLY '%D %d' d:%foo% D:%bar%)
+
+outputs `tt(%bar% %%foo%%)' to tt(REPLY). This is the only case where
+the order that var(spec)s are given in is significant.
+
+Since the output with tt(-q) is expected to be subject to further
+processing, width specifiers don't count the extra escape characters,
+ensuring that the widths are correct em(after) that processing.
+Additionally, with tt(-F), ternary-expression test numbers are compared
+against the em(pre-escaped) spec lengths.
+
+tt(-Q) is like tt(-q) except that it interprets pre-escaped `tt(%)'
+characters in the format string as normal.
+
+The form using the tt(-a) option can be used for aligning
strings. Here, the var(spec)s are of the form
`var(left)tt(:)var(right)' where `var(left)' and `var(right)' are
arbitrary strings. These strings are modified by replacing the colons
diff --git a/Src/Modules/zutil.c b/Src/Modules/zutil.c
index 53b3abe72..0367669cf 100644
--- a/Src/Modules/zutil.c
+++ b/Src/Modules/zutil.c
@@ -809,11 +809,13 @@ bin_zstyle(char *nam, char **args, UNUSED(Options ops), UNUSED(int func))
* olenp *olenp is the size allocated for *outp
* endchar Terminator character in addition to `\0' (may be '\0')
* presence -F: Ternary expressions test emptyness instead
+ * quote -q: >0 if instr should be %-quoted
+ * qspecs qspecs[c] is the number of %s added by %-quoting
* skip If 1, don't output, just parse.
*/
static char *zformat_substring(char* instr, char **specs, char **outp,
int *ousedp, int *olenp, int endchar,
- int presence, int skip)
+ int presence, int quote, int *qspecs, int skip)
{
char *s;
@@ -848,7 +850,12 @@ static char *zformat_substring(char* instr, char **specs, char **outp,
// literally
if (!testit && (!*s || *s == '%' || *s == ')' || *s == '-' || *s == '.')) {
// but swallow the % if this is %% or %)
- start += (s - start == 1 && (*s == '%' || *s == ')'));
+ if (!quote) {
+ start += (s - start == 1 && (*s == '%' || *s == ')'));
+ // if quoting, only swallow with %). admittedly this is confusing
+ } else {
+ start += (s - start == 1 && *s == ')');
+ }
s = start;
}
@@ -869,6 +876,8 @@ static char *zformat_substring(char* instr, char **specs, char **outp,
actval = strlen(specs[(unsigned char) *s]);
else
actval = 1;
+ // don't count extra %s from quoting when testing this
+ actval -= qspecs[(unsigned char) *s];
actval = right ? (testval < actval) : (testval >= actval);
} else {
if (right) /* put the sign back */
@@ -887,20 +896,36 @@ static char *zformat_substring(char* instr, char **specs, char **outp,
* Either skip true text and output false text, or
* vice versa... unless we are already skipping.
*/
- if (!(s = zformat_substring(s+1, specs, outp, ousedp,
- olenp, endcharl, presence, skip || actval)) || !*s)
+ if (!(s = zformat_substring(s+1, specs, outp, ousedp, olenp,
+ endcharl, presence, quote, qspecs,
+ skip || actval)) || !*s)
return NULL;
- if (!(s = zformat_substring(s+1, specs, outp, ousedp,
- olenp, ')', presence, skip || !actval)) || !*s)
+ if (!(s = zformat_substring(s+1, specs, outp, ousedp, olenp,
+ ')', presence, quote, qspecs,
+ skip || !actval)) || !*s)
return NULL;
} else if (skip) {
continue;
} else if ((spec = specs[(unsigned char) *s])) {
- int len;
+ int len, smin = min, smax = max;
+
+ // the assumption with quoted specs is that the output will be
+ // subject to further % expansion -- adjust width specifiers so
+ // that the result will be correct *after* that expansion
+ if ((smin > 0 || smax > 0) && qspecs[(unsigned char) *s]) {
+ int i;
+ for (i = 0; spec[i]; i++) {
+ if (spec[i] == '%') {
+ smin += (smin > 0 && i < smin) ? 1 : 0;
+ smax += (smax > 0 && i < smax) ? 1 : 0;
+ i++;
+ }
+ }
+ }
- if ((len = strlen(spec)) > max && max >= 0)
- len = max;
- outl = (min >= 0 ? (min > len ? min : len) : len);
+ if ((len = strlen(spec)) > smax && smax >= 0)
+ len = smax;
+ outl = (smin >= 0 ? (smin > len ? smin : len) : len);
if (*ousedp + outl >= *olenp) {
int nlen = *olenp + outl + 128;
@@ -960,40 +985,91 @@ static char *zformat_substring(char* instr, char **specs, char **outp,
}
static int
-bin_zformat(char *nam, char **args, UNUSED(Options ops), UNUSED(int func))
+bin_zformat(char *nam, char **args, Options ops, UNUSED(int func))
{
- char opt;
- int presence = 0;
+ unsigned char qopt = OPT_ISSET(ops, 'q') ? 'q' : OPT_ISSET(ops, 'Q') ? 'Q' : 0;
+ int presence = 0, quote = INT_MAX;
- if (args[0][0] != '-' || !(opt = args[0][1]) || args[0][2]) {
- zwarnnam(nam, "invalid argument: %s", args[0]);
+ if (OPT_ISSET(ops, 'q') && OPT_ISSET(ops, 'Q')) {
+ zwarnnam(nam, "only one of -qQ allowed");
+ return 1;
+ }
+ // the error here is more meaningful than the following ones with e.g. -q1F
+ if (OPT_HASARG(ops, qopt)) {
+ char *qptr;
+ quote = (int) zstrtol(OPT_ARG(ops, qopt), &qptr, 10);
+ if (quote < 0 || *qptr) {
+ zwarnnam(nam, "bad argument to -%c: %s", qopt, OPT_ARG(ops, qopt));
+ return 1;
+ }
+ }
+ if (OPT_ISSET(ops, 'a') + OPT_ISSET(ops, 'f') + OPT_ISSET(ops, 'F') < 1) {
+ zwarnnam(nam, "one of -afF expected");
+ return 1;
+ }
+ if (OPT_ISSET(ops, 'a') + OPT_ISSET(ops, 'f') + OPT_ISSET(ops, 'F') > 1) {
+ zwarnnam(nam, "only one of -afF allowed");
+ return 1;
+ }
+ if (OPT_ISSET(ops, 'a') && qopt) {
+ zwarnnam(nam, "-%c not allowed with -a", qopt);
return 1;
}
- args++;
- switch (opt) {
+ switch (OPT_ISSET(ops, 'a') ? 'a' : OPT_ISSET(ops, 'f') ? 'f' : 'F') {
case 'F':
presence = 1;
/* fall-through */
case 'f':
{
char **ap, *specs[256] = {0}, *out;
- int olen, oused = 0;
+ int i, olen, oused = 0;
+ int qspecs[256] = {0};
/* Parse the specs in argv. */
- for (ap = args + 2; *ap; ap++) {
+ for (i = 1, ap = args + 2; *ap; i++, ap++) {
if (!ap[0][0] || ap[0][0] == '-' || ap[0][0] == '.' ||
ap[0][0] == '%' || ap[0][0] == ')' ||
idigit(ap[0][0]) || ap[0][1] != ':') {
zwarnnam(nam, "invalid spec: %s", *ap);
return 1;
}
- specs[(unsigned char) ap[0][0]] = ap[0] + 2;
+
+ // need to quote specs here because zformat_substring() won't
+ // know the order
+ if (qopt && quote >= i) {
+ int len = 0, pct = 0;
+ char *aptr, *sptr, *spec = *ap + 2;
+
+ for (aptr = *ap + 2; *aptr; aptr++, len++) {
+ if (*aptr == '%') {
+ len++, pct++;
+ }
+ }
+
+ if (pct) {
+ spec = (char *) zhalloc(len + 1);
+ sptr = spec;
+ for (aptr = *ap + 2; *aptr; aptr++) {
+ *sptr++ = *aptr;
+ if (*aptr == '%') {
+ *sptr++ = *aptr;
+ }
+ }
+ *sptr = '\0';
+ }
+
+ specs[(unsigned char) ap[0][0]] = spec;
+ qspecs[(unsigned char) ap[0][0]] = pct;
+ } else {
+ specs[(unsigned char) ap[0][0]] = ap[0] + 2;
+ }
}
+
out = (char *) zhalloc(olen = 128);
if (!zformat_substring(args[1], specs, &out, &oused, &olen, '\0',
- presence, 0)) {
+ presence, qopt == 'q', qspecs, 0)) {
zwarnnam(nam, "malformed format string: %s", args[1]);
return 1;
}
@@ -1093,7 +1169,7 @@ bin_zformat(char *nam, char **args, UNUSED(Options ops), UNUSED(int func))
}
break;
}
- zwarnnam(nam, "invalid option: -%c", opt);
+ DPUTS(1, "BUG: unhandled option");
return 1;
}
@@ -2069,7 +2145,7 @@ bin_zparseopts(char *nam, char **args, Options ops, UNUSED(int func))
}
static struct builtin bintab[] = {
- BUILTIN("zformat", 0, bin_zformat, 3, -1, 0, NULL, NULL),
+ BUILTIN("zformat", 0, bin_zformat, 2, -1, 0, "afFq:%Q:%", NULL),
BUILTIN("zparseopts", 0, bin_zparseopts, 0, -1, 0, "a:A:DEFGKMn:v:", NULL),
BUILTIN("zregexparse", 0, bin_zregexparse, 3, -1, 0, "c", NULL),
BUILTIN("zstyle", 0, bin_zstyle, 0, -1, 0, NULL, NULL),
diff --git a/Test/V13zformat.ztst b/Test/V13zformat.ztst
index 545d5e615..32245b7e2 100644
--- a/Test/V13zformat.ztst
+++ b/Test/V13zformat.ztst
@@ -93,8 +93,53 @@
zformat REPLY ''
zformat REPLY '' x:
1:one of -f -F -a required
+?(eval):zformat:1: one of -afF expected
+?(eval):zformat:2: one of -afF expected
+
+ zformat -fff REPLY ''
+ zformat -FFF REPLY ''
+ zformat -aaa reply .
+0:duplicate -f -F -a ignored
+
+ zformat -af REPLY ''
+ zformat -fF REPLY '' x:
+1:more than one of -f -F -a not allowed
+?(eval):zformat:1: only one of -afF allowed
+?(eval):zformat:2: only one of -afF allowed
+
+ zformat -f
+ zformat -f REPLY
+ zformat -F
+ zformat -F REPLY
+1:-f and -F: param and format string required
+?(eval):zformat:1: not enough arguments
+?(eval):zformat:2: not enough arguments
+?(eval):zformat:3: not enough arguments
+?(eval):zformat:4: not enough arguments
+
+ zformat -a
+ zformat -a reply
+1:-a: param and separator required
?(eval):zformat:1: not enough arguments
-?(eval):zformat:2: invalid argument: REPLY
+?(eval):zformat:2: not enough arguments
+
+ zformat -a reply '' a:b && print -rl - $reply
+0:-a with empty separator
+>ab
+
+ zformat -F REPLY '<%1d>' 'd:é' && print -r - $REPLY
+ zformat -F REPLY '<%2d>' 'd:é' && print -r - $REPLY
+ zformat -F REPLY '<%3d>' 'd:é' && print -r - $REPLY
+ zformat -F REPLY '<%.1d>' 'd:é' && print -r - $REPLY
+ zformat -F REPLY '<%.2d>' 'd:é' && print -r - $REPLY
+ zformat -F REPLY '<%.3d>' 'd:é' && print -r - $REPLY
+0f:width specifier is multi-byte-aware
+><é>
+><é >
+><é  >
+><é>
+><é>
+><é>
zformat -F REPLY %B && print -r - $REPLY
zformat -F REPLY %3B && print -r - $REPLY
@@ -227,3 +272,79 @@
0:ternary expression returning literal % or )
>%
>)
+
+ zformat -qa reply .
+ zformat -aq reply .
+1:-a + -q not allowed
+?(eval):zformat:1: -q not allowed with -a
+?(eval):zformat:2: -q not allowed with -a
+
+ zformat -Fq REPLY F && print -r - $REPLY
+ zformat -qF REPLY F && print -r - $REPLY
+ zformat -FqF REPLY F && print -r - $REPLY
+ zformat -q1 -F REPLY F && print -r - $REPLY
+ zformat -q1F REPLY F && print -r - $REPLY
+ zformat -q-1 -F REPLY F && print -r - $REPLY
+1:optional argument to -q
+>F
+>F
+>F
+>F
+?(eval):zformat:5: bad argument to -q: 1F
+?(eval):zformat:6: bad option: --
+
+# the spec order in the format string differs from the order in the arguments
+# here to make sure we're testing -qn's effects on the latter
+ for 1 in '' 0 1 2 3; do
+ zformat -Fq$1 REPLY '%%x %) %. %X %D %d' d:%foo% D:%bar% && print -r - $REPLY
+ done
+0:-q with and without optarg
+>%%x ) %. %X %%bar%% %%foo%%
+>%%x ) %. %X %bar% %foo%
+>%%x ) %. %X %bar% %%foo%%
+>%%x ) %. %X %%bar%% %%foo%%
+>%%x ) %. %X %%bar%% %%foo%%
+
+ zformat -Fq REPLY '%(x.%%/%d.%%/%D)' x:1 d:%foo% D:%bar% && print -r - $REPLY
+ zformat -Fq REPLY '%(X.%%/%d.%%/%D)' x:1 d:%foo% D:%bar% && print -r - $REPLY
+0:-q with ternary expression
+>%%/%%foo%%
+>%%/%%bar%%
+
+ zformat -Fq REPLY '<%1d>' d:%foo% && print -r - $REPLY
+ zformat -Fq REPLY '<%5d>' d:%foo% && print -r - $REPLY
+ zformat -Fq REPLY '<%6d>' d:%foo% && print -r - $REPLY
+ zformat -Fq REPLY '<%7d>' d:%foo% && print -r - $REPLY
+ zformat -Fq REPLY '<%-7d>' d:%foo% && print -r - $REPLY
+ zformat -Fq REPLY '<%-7d>' d:foo && print -r - $REPLY
+0:-q: min-width specifier ignores extra %s
+><%%foo%%>
+><%%foo%%>
+><%%foo%% >
+><%%foo%%  >
+><  %%foo%%>
+><    foo>
+
+ zformat -Fq REPLY '%.1d' d:foo && print -r - $REPLY
+ zformat -Fq REPLY '%.1d' d:%foo% && print -r - $REPLY
+ zformat -Fq REPLY '%.4d' d:%foo% && print -r - $REPLY
+ zformat -Fq REPLY '%.5d' d:%foo% && print -r - $REPLY
+0:-q: max-width specifier ignores extra %s
+>f
+>%%
+>%%foo
+>%%foo%%
+
+ zformat -Fq REPLY '%4(d.t.f) %5(d.t.f) %6(d.t.f)' d:%foo% && print -r - $REPLY
+ zformat -Fq REPLY '%-6(d.t.f) %-5(d.t.f) %-4(d.t.f)' d:%foo% && print -r - $REPLY
+0:-q: ternary width test ignores extra %s
+>t f f
+>t t f
+
+ zformat -FQ REPLY '%%x %) %. %X %D %d' d:%foo% D:%bar% && print -r - $REPLY
+ zformat -FQ0 REPLY '%%x %) %. %X %D %d' d:%foo% D:%bar% && print -r - $REPLY
+ zformat -FQ1 REPLY '%%x %) %. %X %D %d' d:%foo% D:%bar% && print -r - $REPLY
+0:-Q, -Q0, -Q1
+>%x ) %. %X %%bar%% %%foo%%
+>%x ) %. %X %bar% %foo%
+>%x ) %. %X %bar% %%foo%%