Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: Y01 Test Failure



> 2021/03/19 17:27, Mikael Magnusson <mikachu@xxxxxxxxx> wrote:
> 
> Is this happening even with LC_COLLATE=C, or did we not bother setting
> that for this specific test?

LC_ALL is set to en_US.UTF-8 at the start of Y01completion.ztst.

I had assumed that comparing all-ASCII strings gives the same result
in the C and UTF-8 locales. But it turns out that strcoll() behaves quite
*strangely* under en_US.UTF-8 on Linux.

Here is a small C test program, followed by its output under three
different LC_COLLATE settings:

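#include <stdio.h>
#include <string.h>
#include <locale.h>

int main(void) {
    const char *s[] = { "h", "i", "j" };
    /* Take the collation rules from the environment (LC_ALL, LC_COLLATE, LANG). */
    setlocale(LC_COLLATE, "");
    for (int i = 0; i < 3; ++i) {
        /* Only the sign of strcoll()'s result is specified. */
        printf("'%s'  - '<INSERT>' = %d\n", s[i], strcoll(s[i], "<INSERT>"));
    }
    return 0;
}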

% export LC_COLLATE=C
% ./a.out
'h'  - '<INSERT>' = 44
'i'  - '<INSERT>' = 45
'j'  - '<INSERT>' = 46
% export LC_COLLATE=en_US.UTF-8
% ./a.out
'h'  - '<INSERT>' = -11
'i'  - '<INSERT>' = -1
'j'  - '<INSERT>' = 1
% export LC_COLLATE=ja_JP.UTF-8
% ./a.out
'h'  - '<INSERT>' = 44
'i'  - '<INSERT>' = 45
'j'  - '<INSERT>' = 46

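I can't understand the behavior under the en_US.UTF-8 locale: 'h' and
'i' sort before '<INSERT>', but 'j' sorts after it, while the C and
ja_JP.UTF-8 locales both put all three after '<INSERT>'.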



