Zsh Mailing List Archive

PATCH: 3.1.5++/3.0.6 Re: Parameter substitution bug



"Bart Schaefer" wrote:
> On Apr 29,  6:07pm, Peter Stephenson wrote:
> > Subject: Parameter substitution bug
> > Can anybody explain this?
> > 
> > % cat tst
> > fdir='~/bin'
> > print ${~fdir}
> > % zsh -f ./tst
> > ~/bin
> 
> Because globsubst and ${~...} only do filename *generation* not filename
> *expansion*.  It's always been like this ... you never noticed?

No, that's not the answer, and it certainly hasn't always been like this.

1. Far from never noticing, I originally wrote the option (it had a
different name then) *specifically* to expand ~'s from variables, as csh
did.  The globbing behaviour was only added later, and then the name was
changed.  There was a note in the FAQ, but it went out of date some
time ago.  That's why the flag ~ was picked originally; it didn't get
changed when the option name did.  In fact, this is what it says in the
manual:

  Treat any characters resulting from parameter expansion as being
  eligible for file expansion and filename generation, and any
  characters resulting from command substitution as being eligible for
  filename generation.


2. I've discovered what the problem was, and it certainly looks
unintentional to me:

% fdir='~/bin'
% unsetopt extendedglob
% print ${~fdir}
~/bin
% setopt extendedglob
% print ${~fdir}
/home/user2/pws/bin

I take it this is universally regarded as a bug?  The effect of a parameter
expansion flag certainly shouldn't depend on extendedglob.

This has come about because tokenize() only tokenizes a ~ if extendedglob
is set.  This shouldn't be necessary: the globbing code was specially
rewritten so that it only uses a tokenized ~, ^, etc. when extendedglob is
set.  That was done because the shell can parse a function while
extendedglob is unset and then have the option set inside the function when
it is executed; if tokenization depended on the option at parse time, the
parsed strings would not be properly tokenized and the patterns would not
work.  I think we should try the following patch.  If it doesn't work, the
problem is somewhere else anyway, since the lexer has happily been
unconditionally tokenizing ~, # and ^ for some time now.
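
To make the parse-time versus run-time argument concrete, here is a
minimal sketch in C.  It is a toy model, not zsh source: TOK_HAT, the
extendedglob variable and all function names are hypothetical stand-ins
(the real shell tests isset(EXTENDEDGLOB) and uses its own token values).
It compares tokenizing only when the option is set with tokenizing always
and checking the option at the point of use.

```c
#include <stdbool.h>

/* Hypothetical token value standing in for a tokenized '^'. */
#define TOK_HAT '\x01'

/* Stand-in for the shell option; not the real zsh option table. */
static bool extendedglob = false;

/* Policy 1: tokenize '^' only if the option is set at tokenize time. */
static void tokenize_conditional(char *s)
{
    for (; *s; s++)
        if (*s == '^' && extendedglob)
            *s = TOK_HAT;
}

/* Policy 2: always tokenize; leave the option test to the consumer. */
static void tokenize_always(char *s)
{
    for (; *s; s++)
        if (*s == '^')
            *s = TOK_HAT;
}

/* Consumer: a tokenized '^' is special only if the option is set now. */
static bool hat_is_special(const char *s)
{
    for (; *s; s++)
        if (*s == TOK_HAT)
            return extendedglob;
    return false;
}

/* Tokenize with the option unset, then use the string with it set,
 * as happens when a function body turns the option on for itself. */
static bool works_after_late_setopt(void (*tokenize)(char *))
{
    char pattern[] = "^foo";   /* pattern text inside a function body */

    extendedglob = false;      /* option unset while the function is parsed */
    tokenize(pattern);
    extendedglob = true;       /* option set when the function runs */
    return hat_is_special(pattern);
}
```

In this model works_after_late_setopt(tokenize_conditional) is false and
works_after_late_setopt(tokenize_always) is true, which is the reason for
tokenizing unconditionally and testing the option only where the token is
used.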

Note that I've made it so that ^, # and ~ are now tokenized even when SHGLOB
is set.  The same argument applies here, with the side effect that
extendedglob is now independent of shglob for retokenized characters; since
it always was for characters lexed properly, I don't see a problem with
that.  However, SHGLOB should probably be handled somewhere other than in
the lexer, for the same reason: if it's set when a function is parsed,
parentheses won't work in that function even if the option is unset when it
runs (e.g. by `emulate -L zsh').
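
As a rough illustration of how the case reordering in the patch below
changes which characters get tokenized, here is a simplified C model of
the two option tests.  The function names and the boolean parameters are
hypothetical; the real code uses isset()/unset() and goes on to store a
token, and only the case labels visible in the diff are modelled.

```c
#include <stdbool.h>

/* Before the patch: ^ # ~ need extendedglob set and then fall through
 * into the SHGLOB test shared with the parentheses and bar. */
static bool tokenized_before(char c, bool extendedglob, bool shglob)
{
    switch (c) {
    case '^': case '#': case '~':
        if (!extendedglob)
            break;
        /* fall through */
    case '(': case '|': case ')':
        if (shglob)
            break;
        /* fall through */
    case '[': case ']': case '*':
        return true;
    }
    return false;
}

/* After the patch: only ( | ) depend on SHGLOB; ^ # ~ are tokenized
 * unconditionally, like [ ] and *. */
static bool tokenized_after(char c, bool extendedglob, bool shglob)
{
    (void)extendedglob;   /* no longer consulted at this stage */
    switch (c) {
    case '(': case '|': case ')':
        if (shglob)
            break;
        /* fall through */
    case '^': case '#': case '~':
    case '[': case ']': case '*':
        return true;
    }
    return false;
}
```

Under this model a ~ with both options unset is skipped before the patch
and tokenized after it, while ( with SHGLOB set is skipped in both
versions.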

This should apply to 3.0.6 as well.

--- Src/glob.c.eg	Wed Apr 21 16:18:09 1999
+++ Src/glob.c	Fri Apr 30 09:48:42 1999
@@ -3379,16 +3379,14 @@
 	    *t = Inang;
 	    *s = Outang;
 	    break;
-	case '^':
-	case '#':
-	case '~':
-	    if (unset(EXTENDEDGLOB))
-		break;
 	case '(':
 	case '|':
 	case ')':
 	    if (isset(SHGLOB))
 		break;
+	case '^':
+	case '#':
+	case '~':
 	case '[':
 	case ']':
 	case '*':


-- 
Peter Stephenson <pws@xxxxxxxxxxxxxxxxx>       Tel: +39 050 844536
WWW:  http://www.ifh.de/~pws/
Dipartimento di Fisica, Via Buonarroti 2, 56127 Pisa, Italy


