The nice thing about TYPESET_TO_UNSET is that it draws a clear distinction between uninitialized and initialized variables. For references, that distinction matters a great deal because uninitialized references are placeholders that behave very differently from initialized references. The Zsh code "typeset -i var; var=1; var=2" can be loosely translated to the C code "int var; var=1; var=2". There is no fundamental difference between the initialization and subsequent assignments; both translate to the same C operation. Things are very different for the Zsh code "typeset -n ref; ref=var; ref=42", which can be loosely translated to the C code "int *ref; ref=&var; *ref=42". In this case the initialization and subsequent assignments translate to very different operations.
What I find absurd and detrimental is the attempt to equate a reference initialized with the empty string with a placeholder. To me this looks as absurd as pretending that a C pointer to an int holding 0 is the same thing as a null int pointer. It is detrimental because it implies that a statement like "typeset -nu ref=$1" will not necessarily initialize the reference. Countless users will have to figure out why their script behaves so strangely, only to find out that they mistakenly passed in an empty string instead of the desired variable name. This is a common error that would have been caught immediately if the "typeset" statement had instead complained about an invalid variable name.
Bart, you say that "as things stand "typeset -n ref=" is a necessity". I really don't see why, especially given that "typeset -a arr=" is already not accepted. And rightly so! An empty string is obviously neither an empty array nor an array that contains just the empty string.
Can you give a concrete example of something fundamental that would break down if we don't accept "typeset -n ref="?
I really don't see anything fundamental that wouldn't work. I can see that when TYPESET_TO_UNSET is disabled, there would be a slight discrepancy between references and other types of variables because for references "typeset -p" would not include any value for a reference defined with no initialization value, while for other types of variables it includes the type's default value.
I frankly doubt that any user would blame us for this. Who cares about that? And why? For sure, there will be orders of magnitude more users that will spend time debugging a misbehaving "typeset -nu ref=$1" than users complaining about the fact that "typeset -n ref; typeset -p ref" doesn't yield an output that includes a default value.
If we really wanted a default value for references, then we should at least adopt something that makes sense. As with arrays, the empty string does not qualify. Here, what would make sense is a token (not a string) that is more or less the equivalent of NULL in C. Maybe we could adopt "<null>" for that. So, when TYPESET_TO_UNSET is disabled, "typeset -n ref" would be equivalent to "typeset -n ref=<null>". If "ref" is a reference, then you could write "ref=<null>" to turn it back into a placeholder. Like the ( ... ) syntax, the <null> token could only be used on the right-hand side of assignments. So, as with ( ... ), you could never pass it as an argument. Thus, "typeset -nu ref=$1" would always initialize "ref", even if "$1" is equal to the empty string or the string "<null>", both of which would trigger an "invalid variable name" error.
Currently you can do the following:
typeset -i var
while ...; do
  var=0  # Reset "var" to its default value
  ...
done
You can do the same for strings, floats, and arrays. You just have to use the appropriate default value. For references, however, that is currently not possible, which is further evidence that the empty string is a bogus default value for references. If we adopted the <null> token, then the same would also work for references.
Should we adopt <null>? In my opinion, we don't really need a default value for references, so I would rather do without it. However, if we think that a default value is absolutely needed, then yes, we should adopt it. We should not adopt a half-baked default value like the empty string, which causes far more trouble than it solves, but go all the way and adopt a true default value that actually works like one.
Philippe