Zsh Mailing List Archive

printf %s in UTF-8 is not POSIX-compliant



Hi,

Under UTF-8 locales:

vin:~> zsh-beta -f
vin% emulate sh
vin% printf ".%2s.\n" é
. é.
vin% /usr/bin/printf ".%2s.\n" é 
.é.
vin%

As you can see, the zsh printf builtin doesn't behave like the
coreutils printf, and it is zsh that is wrong. Indeed, the field
width and the precision are counted in bytes, not in characters:
"é" is 2 bytes in UTF-8, so %2s must not add any padding.
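
To make the byte arithmetic explicit, here is a minimal sketch that does not depend on the current locale or on which printf is used. The variable names (s, bytes, width, pad) are illustrative only, and "é" is built from its UTF-8 octal escapes so that the script encoding does not matter:

```shell
# "é" in UTF-8 is the two bytes 0xC3 0xA9 (octal 303 251).
s=$(printf '\303\251')

# wc -c always counts bytes, whatever the locale is.
bytes=$(( $(printf '%s' "$s" | wc -c) ))

# With a byte-based field width of 2, the padding is width - bytes.
width=2
pad=$(( width - bytes ))
[ "$pad" -lt 0 ] && pad=0

echo "bytes=$bytes pad=$pad"    # prints: bytes=2 pad=0
```

A padding of 0 corresponds to the ".é." output above; the ". é." output corresponds to counting "é" as 1 character instead.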

http://www.opengroup.org/onlinepubs/009695399/utilities/printf.html

says (in the extended description) that the "file format notation"
shall be used for the format (and %s isn't an exception).

http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap05.html

(file format notation) says:

  s
    The argument shall be taken to be a string and bytes from the
    string shall be written until the end of the string or the number
    of bytes indicated by the precision specification of the argument
    is reached. If the precision is omitted from the argument, it
    shall be taken to be infinite, so all bytes up to the end of the
    string shall be written.
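
As a rough illustration of what a byte-based precision implies, the sketch below emulates %.1s on "é" with dd instead of relying on any particular printf implementation. The variable names are illustrative only; the point is that a precision of 1 keeps only the first byte, which is not a complete UTF-8 character:

```shell
# "é" in UTF-8 is the two bytes 0xC3 0xA9 (octal 303 251).
s=$(printf '\303\251')

# Keep only the first byte, as a byte-based precision of 1 would,
# and dump it in octal so that the result is printable.
first=$(printf '%s' "$s" | dd bs=1 count=1 2>/dev/null \
        | od -An -to1 | tr -d '[:space:]')

echo "$first"    # prints: 303
```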

Note: ksh93 has the same bug, but pdksh and bash do not. However, bash
may change its behavior when not in POSIX compatibility mode, see

  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=459413

-- 
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)


