Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: Match length and multibyte characters



>> % array=(a ä a)
>> % print ${${(O)array//(#m)*/${#MATCH}}[1]} ${${(ON)array%%*}[1]}
>> 1 2
>>
>> Can maybe someone shed some light on whether the second version is
>> supposed to work with multibyte characters and,
>
> The second version returns 2 because ä is a 2-byte character in UTF-8.
> This is a bug in the current zsh; the flags N, B and E do not work
> well with multibyte characters in ${...#...}, ${...%...} etc.

Thanks for clearing that up. I was just unsure whether this is really
a bug or whether there's another flag that I have to apply to make
it work with Unicode characters, too.


> The patch below may fix the bug.

This is what I get after applying your patch:

/home/debian/zsh-5.0.7/obj/Src/../../Src/glob.c:2489: undefined reference to `MB_METASTRLEN2END'
/home/debian/zsh-5.0.7/obj/Src/../../Src/glob.c:2495: undefined reference to `MB_METASTRLEN2END'
/home/debian/zsh-5.0.7/obj/Src/../../Src/glob.c:2483: undefined reference to `MB_METASTRLEN2END'

This might be because I'm on the older 5.0.7; I didn't try 5.1.1. In
any case, I'd rather work around this bug until it gets fixed upstream
than patch zsh on each of my machines individually.
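
One possible workaround, sketched here as an illustration rather than
taken from the thread, is to avoid the (N) flag altogether and compare
${#elem} for each element in a loop. The helper name max_len is
hypothetical. It relies on ${#var} counting characters rather than
bytes, which zsh does when the MULTIBYTE option is set (the default)
and the locale is UTF-8; in a C locale, or in shells without multibyte
support, the result is a byte count.

```shell
# Hypothetical helper (not part of zsh): print the length of the
# longest argument. The body runs in a subshell, so max and elem
# do not leak into the calling shell.
max_len() (
  max=0
  for elem in "$@"; do
    # ${#elem} is the length of $elem; zsh counts characters here
    # when MULTIBYTE is set and the locale is UTF-8.
    if [ "${#elem}" -gt "$max" ]; then
      max=${#elem}
    fi
  done
  printf '%s\n' "$max"
)

# Same elements as the array in the original example.
max_len a ä a
```

With the array from the example, the call in zsh would be
max_len $array. This gives the same kind of result as the
${#MATCH} version quoted above, without depending on extended
globbing.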

> BTW, in your example, it is better to replace the flag (O) by (On)

True. I've used (On) during my tests but then forgot the crucial n in
my posting.

Best

erik


