Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: strange glob parsing



On Tue, Jun 3, 2025, at 12:14 AM, dana wrote:
> On Mon 2 Jun 2025, at 22:19, Lawrence Velázquez wrote:
>> It's not what I expect.  I believe that POSIX requires the synonymous
>> "[!]" to match a literal '[' followed by a literal '!' followed by
>> a literal ']', and bash, dash, and ksh perform matching this way.
>
> i agree that it's unexpected but not sure about the rest? posix isn't
> very explicit but it implies (in 9.3.5) that the body of the bracket
> expression (the ... in [...] or [^...]) can't be empty. all the regex
> engines i tested treat both [] and [^] as erroneous so that seems to be
> the case

Right, but the difference is that shell pattern matching shouldn't
throw an error in this case:

	A <left-square-bracket> that does not introduce a valid
	bracket expression shall match the character itself.

https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/utilities/V3_chap02.html#tag_19_14_01
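
For illustration, here is a minimal sketch of that rule using case
instead of filename generation, so nothing has to exist on disk.
The helper name matches_literally is hypothetical (it is not from
the standard or from this thread), and the sketch assumes a
POSIX-style shell such as bash or dash, where an unquoted expansion
in a case pattern is treated as a pattern.  Native-mode zsh does not
do that without GLOB_SUBST, so it is not a fair test there.

```shell
# Sketch only: matches_literally is a hypothetical helper.
# It reports whether a pattern, used unquoted as a case pattern,
# matches the very same characters taken as a literal string.
matches_literally() {
	case $1 in
		$1) echo "literal match: $1" ;;
		*)  echo "no literal match: $1" ;;
	esac
}

matches_literally '[]'
matches_literally '[!]'
matches_literally '[a]'   # valid bracket expression: matches "a", not "[a]"
```

If the shell follows the sentence quoted above, the first two calls
should report a literal match, while the valid bracket expression
in the last call should not.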


> linux glob(7) is clearer:
>
>> The string enclosed by the brackets cannot be empty
>
> and the context implies that that extends to [!...]
>
> afaict bash treats both [] and [^] as never-matching, which is in
> accordance with that

They do match in bash, but they match the strings "[]" and "[^]"
literally, which is admittedly hard to see because an unmatched
pattern is left unchanged by default.  The failglob shopt helps:

	% cat /tmp/foo.bash
	shopt -s failglob

	rm -fr /tmp/foo && mkdir "$_" && cd "$_" || exit

	echo []
	echo [!]
	echo [^]

	touch '[]' '[!]' '[^]' || exit

	echo []
	echo [!]
	echo [^]

	% bash /tmp/foo.bash
	/tmp/foo.bash: line 5: no match: []
	/tmp/foo.bash: line 6: no match: [!]
	/tmp/foo.bash: line 7: no match: [^]
	[]
	[!]
	[^]


> the [^] case is the one that's arguably unexpected. that's due to
> workers/35131. i'm not sure whether that specific behaviour with '^' was
> intended or not. i suppose if [] matches nothing it makes a certain
> sense for [^] to match everything, but as mentioned bash works in a
> different way which is also logical and more consistent with regex

If nothing else, it might be worth considering an adjustment to
sh/ksh compatibility mode.


-- 
vq



