Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: [PATCH] Support the mksh's ${|func;} substitution



On Wed, May 3, 2023 at 7:46 AM Sebastian Gniazdowski
<sgniazdowski@xxxxxxxxx> wrote:
>
> I was looking at some of my old patches and stumbled upon this one
> which implements mksh' ${|func;} substitution.
> [...] Original thread:
> https://zsh.org/mla/workers/2019/msg00768.html

I'm not going to try to recapitulate all the objections/discussions
from that thread, but here are some thoughts.

> [...] The old thread went
> over a complex (x) flag [but] ${!func;} doesn't block us from adding a
> more advanced x-flag in the future. [...]

An argument put forward in the code comments in this most recent
patch, namely that adding an (x) flag requires further revisions of
"14.3.3 Rules" in the documentation, does have some merit.  It would
be complicated to combine (x) with other flags in the same
substitution step, and even more convoluted to try to explain it.  I
think it's preferable to treat it more like a command substitution ala
$(...).

> To recall - the ${|func;} substitution runs function "func" and
> substitutes (returned string in…) $REPLY after it has completed,
> without (!) doing forks, which is a performance and functionality
> benefit over currently available equivalent: $(func;print -rn --
> $REPLY).

One of the significant bits from the previous thread was the notion
that the mksh syntax is actually ${|command} rather than just limited
to functions.  I looked at Sebastian's later patch (workers/44742)
that tried to implement this, but I think calling bin_eval() isn't
necessary.

Also in that thread Stephane points out that it would be ideal if
(paraphrasing his example) "${| echo { }" were parsed similarly to "{
echo { }" and that to do so would require changes to the parser, not
just to the substitution process.  However, we can get most of the way
there with what Sebastian has proposed and any future parser revisions
to take this out of paramsubst() would not (I think) need to
invalidate code written to this first approximation.

The other thing that bothered me about the idea is the hardcoded
reliance on $REPLY, so I took a stab at something to address that.

Given the foregoing, here's a proposed alternative to Sebastian's
patch.  Formal doc and tests pending reaction.  Informally,

${|code} passes the code through execstring() and then substitutes the
value of $REPLY.
${|VAR|code} passes code through execstring() and then substitutes the
value of $VAR.

This even works where VAR includes a namespace prefix or a subscript
expression, but VAR must otherwise look like a scalar parameter
reference.  The main limitation on code so interpolated is that it
can't contain unquoted { or } ... multi-line code is OK.

A remark about the last hunk of the patch:  I kept this from
Sebastian's original patch, it has the effect that ${|code} never
appears to be an unset parameter, even if that code does not set
REPLY.  I'm not certain that's actually necessary.
diff --git a/Src/subst.c b/Src/subst.c
index 14947ae36..c5c060d56 100644
--- a/Src/subst.c
+++ b/Src/subst.c
@@ -1860,6 +1860,8 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
      * joining the array into a string (for compatibility with ksh/bash).
      */
     int quoted_array_with_offset = 0;
+    /* Indicates ${|...;} */
+    char *rplyvar = NULL;
 
     *s++ = '\0';
     /*
@@ -1887,8 +1889,53 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
      * flags in parentheses, but also one ksh hack.
      */
     if (c == Inbrace) {
+	/* The command string to be run by ${|...;} */
+	char *cmdarg = NULL;
+	size_t slen = 0;
 	inbrace = 1;
 	s++;
+
+        /* Short-path for the command-running substitution ${|cmd;}
+         * The command string is extracted and executed, and the
+         * substitution assigned. There's no (...)-flags processing,
+         * i.e. no ${|(U)cmd;}, because it looks quite awful and
+         * also requires a change to the manual description of the
+         * substitution order. Use ${(U)${|cmd;}} instead, it looks
+         * cleaner. */
+	if (*s == Bar && ((slen = strlen(s) - 1) > 1)) {
+	    char *outbracep = s + slen;
+	    if (outbracep[0] == Outbrace /* && outbracep[-1] == ';' */) {
+		
+		if ((rplyvar = itype_end(s+1, INAMESPC, 0))) {
+		    if (*rplyvar == Inbrack &&
+			(rplyvar = parse_subscript(++rplyvar, 1, ']')))
+			++rplyvar;
+		}
+		if (rplyvar == s+1 && *rplyvar == Bar) {
+		    /* Is ${||...} a subtitution error or a syntax error?
+		    zerr("bad substitution");
+		    return NULL;
+		    */
+		    rplyvar = NULL;
+		}
+		if (rplyvar && *rplyvar == Bar) {
+		    cmdarg = dupstrpfx(rplyvar+1, outbracep-rplyvar-1);
+		    rplyvar = dupstrpfx(s+1,rplyvar-s-1);
+		} else {
+		    cmdarg = dupstrpfx(s+1, outbracep-s-1);
+		    rplyvar = "REPLY";
+		}
+		s = outbracep;
+	    }
+	}
+
+	if (rplyvar && cmdarg && *cmdarg) {
+	    /* Execute the shell command */
+	    untokenize(cmdarg);
+	    execstring(cmdarg, 1, 0, "cmdsubst");
+	    s = dyncat(rplyvar, s);
+	}
+
 	/*
 	 * In ksh emulation a leading `!' is a special flag working
 	 * sort of like our (k).  This is true only for arrays or
@@ -2588,8 +2635,11 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
 			      ((unset(KSHARRAYS) || inbrace) ? 1 : -1)),
 			     scanflags)) ||
 	    (v->pm && (v->pm->node.flags & PM_UNSET)) ||
-	    (v->flags & VALFLAG_EMPTY))
-	    vunset = 1;
+	    (v->flags & VALFLAG_EMPTY))	{
+	    if (!rplyvar) {
+		vunset = 1;
+	    }
+	}
 
 	if (wantt) {
 	    /*


Messages sorted by: Reverse Date, Date, Thread, Author