Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: [PATCH] [[:blank:]] only matches on SPC and TAB

Stephane already quoted some man pages, but here is what the C99/C11
standards say:

"The isblank function tests for any character that is a standard blank
character or is one of a locale-specific set of characters for which
isspace is true and that is used to separate words within a line of
text. The standard blank characters are the following: space (' '),
and horizontal tab ('\t'). In the "C" locale, isblank returns true
only for the standard blank characters."

And POSIX seems to say the same: it defines blank for the C locale
and states that in other locales it should at least encompass space
and tab.

So in other locales, what counts as a blank beyond space and tab seems
to be left entirely open, and everybody does what they think is a good
choice. Thus the mess
Stephane observed. In fact, I looked at the musl library and found
this code:
int isblank(int c)
{
	return (c == ' ' || c == '\t');
}
int __isblank_l(int c, locale_t l)
{
	return isblank(c);
}
So they completely ignore the locale and just use the bare minimum
required by the standard. So after the patch, zsh would not only
behave differently on different platforms but would also change its
behavior if you link with a different libc.

Nevertheless, I'm slightly in favour of the patch. While defining our
own :blank: for other locales might give us consistency across
platforms, I think it will end up being different from what everybody
else does and will thus lead to unexpected results for users -- in
particular if the libcs start to agree on isblank for different
locales. And at that point, it might be difficult to change the
behavior if it breaks backward compatibility.

In fact, it's the hope that the situation will improve in the future
that sways me towards the patch over the status quo. But seeing
the mess Stephane uncovered made it a very tight race.

Finally, whether the patch gets applied or not, the documentation
should definitely be updated to reflect the issues around :blank:.

