Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: Surprising behaviour with numeric glob sort



2017-06-05 20:13:54 -0700, Bart Schaefer:
[...]
> Like I said, I think it does this wrong.  If I'm reading the code
> correctly, it first compares the strings for absolute identity while
> searching for embedded nuls, and if they are identical up to the nul
> it then orders the shorter string before the longer one; otherwise
> it skips past the last nul and then relies on strcoll() for the rest
> of both strings.  It would seem to me that the collation order should
> be checked before any nul as well as after, otherwise the first loop
> might conclude the strings differ when strcoll() would order them the
> same.  (However, read below.)

I see, like in:

$ print -lo $'\u2461\0d' $'\u2463\0c' $'\u2460\0b' $'\u2462\0a' | tr -cd 'abcd\n'
d
c
b
a


Even though \u2460 .. \u2463 all sort the same in my locale, so
the order should be:

a
b
c
d
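
To make the difference concrete, here is a toy model in Python (not
zsh's actual code). toy_weight() is a hypothetical stand-in for
strcoll() in a locale where the circled digits all collate equally.
The first key looks only at the text before the nul, the second
collates every nul-separated segment in turn. Python's sort is stable,
so tied elements keep their input order, which happens to match the
output shown above.

```python
def toy_weight(segment):
    # Hypothetical stand-in for strcoll(): circled digits U+2460..U+2463
    # all get the same weight, mimicking a locale where they tie.
    return tuple(0 if 0x2460 <= ord(ch) <= 0x2463 else ord(ch)
                 for ch in segment)

def first_segment_key(s):
    # Only the text before the first NUL takes part in the comparison.
    return toy_weight(s.split("\0", 1)[0])

def every_segment_key(s):
    # Collate each NUL-separated segment in turn, so a tie before the
    # NUL is broken by whatever follows it.
    return tuple(toy_weight(seg) for seg in s.split("\0"))

def tails(items):
    # Keep only the part after the NUL, like the tr -cd filter above.
    return [s.split("\0")[-1] for s in items]

words = ["\u2461\0d", "\u2463\0c", "\u2460\0b", "\u2462\0a"]

print(tails(sorted(words, key=first_segment_key)))  # ['d', 'c', 'b', 'a']
print(tails(sorted(words, key=every_segment_key)))  # ['a', 'b', 'c', 'd']
```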

[...]
> (I don't think zero-padding will work as we
> don't know how many zeroes are needed to make the strings be the same
> number of digits.)

Yes, like I said, that would mean an extra scan of the whole
list to find the widest number.

Or, since most of the rest of zsh can't cope with decimal
integer numbers that are more than 19 digits, pad to 20 digits
(at the expense of memory and unnecessary byte comparisons when
it comes to comparing those large numbers of zeros), like in my
n() sorting function for *(o+n) as a replacement for *(n):
n() REPLY=${REPLY//(#m)<->/${(l:20::0:)MATCH}}
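
The effect of that padding can be sketched outside zsh as well. The
following Python snippet is only an illustration of the idea, with
hypothetical names (pad_numbers, WIDTH) and made-up file names: every
run of digits is left-padded with zeros to a fixed width, after which
a plain string comparison orders the names numerically. Digit runs
longer than the width are left untouched here.

```python
import re

WIDTH = 20  # same field width as the (l:20::0:) flag in n()

def pad_numbers(name, width=WIDTH):
    # Left-pad every run of ASCII digits with zeros to a fixed width.
    return re.sub(r"[0-9]+", lambda m: m.group().zfill(width), name)

names = ["file10", "file9", "file100", "file1"]

print(sorted(names))                   # ['file1', 'file10', 'file100', 'file9']
print(sorted(names, key=pad_numbers))  # ['file1', 'file9', 'file10', 'file100']
```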

-- 
Stephane


