Zsh Mailing List Archive

Re: PATCH: (large) initial support for combining characters in ZLE.



Thank you for starting the combining character support!

At 17:54 +0100 08.4.13, Peter Stephenson wrote:
>the base character must be an alphanumeric (and
>I'm not sure about the numeric, I need to find a better definition), and

I think this is too restrictive, because in some Asian languages
(Japanese, Korean, Thai, etc.) the base character can be non-alphabetic.
For example, in Japanese, Hiragana/Katakana can be combined with
U+3099 (COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK) or
U+309A (COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK).
Example: U+3057 U+3099 = "じ"
the base character U+3057 = "し" is not an alphanumeric.

>the zero-width characters afterwards (I haven't imposed a limit on how
>many there are) must be punctuation.

I guess this is also too restrictive. I ran the following code
on Fedora 7:

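#define _XOPEN_SOURCE 700	/* needed for wcwidth() on some systems */
#include <locale.h>
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>
int main(void) {
	setlocale(LC_ALL, "");	/* use the user's locale */
	/* list zero-width characters that are not punctuation */
	for (wchar_t w = 1; w < 0x2ffff; ++w) {
		if (wcwidth(w) == 0 && iswpunct(w) == 0)
			printf("%05lx: %lc\n", (unsigned long)w, (wint_t)w);
	}
	return 0;
}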

It listed 166 characters, all of which seem to be combining chars in
Thai or Korean (U+0e4e and U+1160 may not be combining, I'm not sure).

I think strictly defining a combined character is virtually impossible,
because there are so many "nonsensical" combinations like
"Hiragana + umlaut". Even within the alphabet, a combination like
"x + U+0318" is almost as strange as "space + grave".

How about accepting any combination?
If the terminal emulator displays garbage, the user can turn off the
option COMBINING_CHARS to see the hex code.
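
As a rough sketch of what "accepting any combination" could look like:
take one base character of non-zero width and attach every zero-width
character that follows it, with no test on character class at all. This
is not taken from the patch. combined_len() and demo_width() are
hypothetical names, and demo_width() is only a stand-in for wcwidth()
that knows a few ranges, so that the example does not depend on the
locale.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Width callback; in real use this would be wcwidth(). */
typedef int (*width_fn)(wchar_t);

/*
 * Hypothetical stand-in for wcwidth(): reports width 0 only for
 * U+0300..U+036F and U+3099..U+309A, and width 1 for everything else.
 */
static int demo_width(wchar_t c)
{
	if (c >= 0x0300 && c <= 0x036f)
		return 0;
	if (c == 0x3099 || c == 0x309a)
		return 0;
	return 1;
}

/*
 * Return how many wide characters at the start of s (n available)
 * form one combined character: a base plus any number of following
 * zero-width characters, whatever their class.
 */
static size_t combined_len(const wchar_t *s, size_t n, width_fn width)
{
	size_t i;

	assert(width != NULL);
	if (n == 0)
		return 0;
	/* a zero-width character with no base in front stands alone */
	if (width(s[0]) == 0)
		return 1;
	i = 1;
	while (i < n && width(s[i]) == 0)
		i++;
	return i;
}
```

With this, U+3057 U+3099 followed by "x" is treated as a combined
character of length 2, and "x" followed by U+0318 U+0300 as one of
length 3, without asking whether the base is alphanumeric or the
trailing characters are punctuation.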


