Re: BUG: Initializations of named references with an empty string should trigger an error

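> I was going to demonstrate, but instead I just demonstrated what I
> think may be a bug.
>
> typeset -n ref=var
> typeset -n ref
>
> does not reset the state of ref to being a placeholder. It's still
> got var as referent.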

I don't think that this is a bug. The same happens with "typeset -i var=42; typeset -i var"; the "var" doesn't reset to "0". The same happens for strings and arrays. I think that (unfortunately) this is by design. It has two consequences:

1) When TYPESET_TO_UNSET is enabled, it's never possible to reset a variable to its uninitialized state with a single "typeset" statement.

2) For references, if "typeset -n ref=" is forbidden, then there is no way to reset a reference to a placeholder with a single "typeset" statement.
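The integer case mentioned above is easy to check. This is only a minimal sketch, run at the top level with default options; "var" is just an illustrative name:

```shell
# Re-declaring an integer without a value does not reset it.
typeset -i var=42
typeset -i var      # same type, no value given: the old value is kept
echo "$var"         # prints 42, not 0
```

The second "typeset" only re-applies the attribute; it does not touch the value, which is the behavior that the two consequences above follow from.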

Though, in both cases, there is a workaround. You can always do "unset -n var; typeset -i var". This works for all types and always resets the variable to its default value or its uninitialized state. It's just a bit verbose. That's why I suggested the addition of a "-S" option (for hard/forced Set) to "typeset" that would implicitly do the "unset -n var". Regardless of what we decide for references, I still think this could be useful, especially for when TYPESET_TO_UNSET is enabled.

> typeset -n ref=var
> unset -n ref
>
> also sort of does so, in that re-assigning or re-declaring ref revives
> it. I think this is actually a bug -- "unset -n ref" should cause ref
> to cease to be a named reference at all

I also observed this behavior but wasn't sure whether it was a bug or a feature. I have now noticed that it works at the top-level but not in functions:

typeset -n ref1=var; unset -n ref1; ref1=var; typeset -p ref1
() { typeset -n ref2=var; unset -n ref2; ref2=var; typeset -p ref2 }

Output:

typeset ref1=var
typeset -n ref2=var

So there is definitely a bug. I guess that both should work like the first one, which is also what happens for "-i" variables.

> Regarding "<null>", we had a long discussion around the time of the
> implementation of typesettounset and eventually rejected the idea of
> having an out-of-band value representing a null parameter. I would
> rather not revisit that at this point, but if others feel as strongly
> as Philippe does about declaring empty-value namerefs, then I suggest
> we use "." (dot) instead of something like "<null>".

In principle, "." is still the wrong syntax because it's a valid string. It can still show up in the "$1" of a "typeset -nu ref=$1", where it should be rejected as an invalid variable name but will instead be interpreted as the default value for a reference placeholder. In practice, however, I think it's orders of magnitude better than using the empty string for that purpose: the chances that someone will inadvertently pass a "." where a variable name is expected are extremely low, whereas inadvertently passing an empty string is very easy, since a simple typo in a parameter expansion will often produce one. With "." as the default value we get the following:

typeset -n ref=. # Good: creates a placeholder reference

ref=. # Ok: assigns "." to the referred variable or does nothing if "ref" is a placeholder

var=. # Ok: assigns "." to "var".

name=.; typeset -n ref=$name # Bad: creates a placeholder instead of reporting an "invalid variable name" error
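To illustrate why the empty string is the riskier default, here is a minimal sketch of the typo scenario. The names "target" and "nmae" are made up for the example, and it assumes that NO_UNSET ("set -u") is not in effect:

```shell
# "name" holds the intended referent; "nmae" is a deliberate typo for "name".
name=target
arg=$nmae               # the misspelled parameter is unset, so this expands to ""
printf '[%s]\n' "$arg"  # prints []
```

If "$arg" were then passed where a variable name is expected, the empty string would arrive silently, whereas a "." only ever appears if someone writes it on purpose.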

The advantage of the syntax "<null>" is that it's currently NOT a valid expression. You can't use it on the right-hand side of an "=": neither "typeset -n ref=<null>" nor "ref=<null>" is valid. The function call "foo <null>" is also invalid (but "foo aaa <null> zzz" is valid if there is a file named "null"). In principle, it should be possible to change the parser to recognize "<null>" as a special token and only accept it on the right of an "=". With "<null>" as the default value we get the following:

typeset -n ref=<null> # Good: creates a placeholder reference

ref=<null> # Good: resets an existing reference into a placeholder

var=<null> # Good: reports an "inconsistent type for assignment" error *

name="<null>"; typeset -n ref=$name # Good: reports an "invalid variable name" error

* Alternatively we could accept "var=<null>" as a shortcut to define a placeholder reference like "arr=()" is a shortcut to define an empty array but I would rather report an error.

In practice, recognizing "<null>" as a special token, even if that only needs to happen on the right of an "=", may not be that easy. Furthermore, the syntax "<null>" is not very shell-friendly; it's too verbose. Just "<>" would be better but maybe even more difficult to implement. I don't really care about the exact syntax. I'm fine with anything that works.

A middle ground between "." and a new syntax like "<null>" would be to reuse the syntax "()". It may look a little weird at first, as we are used to it representing an empty array, but one could argue that "()" is better syntax for an empty/placeholder reference than "." because it evokes something that is empty. With "()" as the default value we get the following:

typeset -n ref=() # Good: creates a placeholder reference

typeset -n ref=(aa bb cc) # Good: reports an "inconsistent type for assignment" error **

ref=() # Ok: sets the referred variable to an empty array or does nothing if "ref" is a placeholder

var=() # Ok: sets "var" to an empty array

name="()"; typeset -n ref=$name # Good: reports an "invalid variable name" error

name=(); typeset -n ref=$name # Good: reports an "invalid variable name" error (because $name expands to "")

** I have just noticed that currently "typeset -n ref=(aa bb cc)" generates a "typeset -an ref=( aa bb cc )". I guess that's a bug. It should report an "inconsistent type for assignment" error, right?

Overall "()" is better than "." and only slightly worse than "<null>" (or rather not as good as "<null>"). Compared to "<null>", its main shortcoming is that it doesn't offer a short syntax to reset a reference to a placeholder.

In practice, if the parser is left as it is, "()" as the default would suffer from a few glitches. For example, I assume that "var=; typeset -n ref=($var)" would create a placeholder: the "typeset" statement would see an empty array because "$var" would already have been dropped in an earlier phase. I think that we could live with that, but if desired, it could be addressed by changing the parser to recognize "()" as a special token.

I still think that having no default value for references is a viable option. Its main shortcoming is that resetting an existing reference to a placeholder requires the use of "unset -n ref" (or the introduction of a new "-S" option for "typeset"). It also introduces the following discrepancy:

typeset str; echo -n "${(!)+str} - "; typeset -p str
typeset -i int; echo -n "${(!)+int} - "; typeset -p int
typeset -a arr; echo -n "${(!)+arr} - "; typeset -p arr
typeset -n ref; echo -n "${(!)+ref} - "; typeset -p ref

Output:

1 - typeset str=''
1 - typeset -i int=0
1 - typeset -a arr=( )
0 - typeset -n ref

Placeholder references would not be considered as set and would not include a value in the output of "typeset -p", even when TYPESET_TO_UNSET is disabled. It's a small discrepancy that I could easily live with.

Given all the options, here are my preferences from most favored to least favored:

- References use a new syntax (e.g., "<null>" or "<>") as the default value

- References use the syntax "()" as the default value

- References have no default value

- References use the string "." as the default value

- References use the string "" as the default value

> It's becoming unhelpful for just Philippe and I to keep batting these
> points back and forth, it's not building any kind of consensus.

Indeed, it would be good if a few others also chimed in.

Philippe

On Mon, Jun 9, 2025 at 2:29 AM Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> wrote:

On Sun, Jun 8, 2025 at 3:21 PM Philippe Altherr
<philippe.altherr@xxxxxxxxx> wrote:
>
> Bart, you say "as things stand "typeset -n ref=" is a necessity.". I really don't see why?

I was going to demonstrate, but instead I just demonstrated what I
think may be a bug.

typeset -n ref=var
typeset -n ref

does not reset the state of ref to being a placeholder. It's still
got var as referent.

typeset -n ref=var
typeset -n ref=

does so, which is why I consider it necessary. However ...

typeset -n ref=var
unset -n ref

also sort of does so, in that re-assigning or re-declaring ref revives
it. I think this is actually a bug -- "unset -n ref" should cause ref
to cease to be a named reference at all, as explained by this bit of
doc:

Thus to remove a named reference, use either 'unset -n PNAME'
(preferred) or one of:

typeset -n PNAME=
typeset +n PNAME

followed by

unset PNAME

My original point was going to be that

typeset -n ref=var
typeset -n ref=
typeset -i ref

is intentionally an error ("can't change type of a named reference") whereas

typeset -n ref=var
unset -n ref
typeset -i ref

should work (but doesn't), and therefore ref= is a distinct,
necessary, operation.

Aside, note that

setopt typesettounset
typeset -n ref
unset ref

does not work, because ref is already unset so unsetting it again is a
no-op. You must use the incantations in that doc excerpt.

Regarding "<null>", we had a long discussion around the time of the
implementation of typesettounset and eventually rejected the idea of
having an out-of-band value representing a null parameter. I would
rather not revisit that at this point, but if others feel as strongly
as Philippe does about declaring empty-value namerefs, then I suggest
we use "." (dot) instead of something like "<null>". In which case I
believe declaring

unsetopt typesettounset
typeset -n ref
typeset -p ref

ought to print

typeset -n ref=.

the way that integers are initialized to 0, etc. Please note,
however, that fundamentally I would prefer NOT to introduce this.

It's becoming unhelpful for just Philippe and I to keep batting these
points back and forth, it's not building any kind of consensus.