Zsh Mailing List Archive

Re: UTF-8 FAQs



Clint Adams wrote:
> The number of questions I've been getting about zsh and UTF-8
> has skyrocketed over the past couple of weeks.
> 
> Perhaps some of these belong in the zsh FAQ.  Corrections to my stock
> answers are welcome.

I imported it as shown in the patch below.  It also needs to go into the
copy on the web site (which used to be the master copy, though effectively
the one patched here is now; they're both in CVS).

Index: Etc/FAQ.yo
===================================================================
RCS file: /cvsroot/zsh/zsh/Etc/FAQ.yo,v
retrieving revision 1.10
diff -u -r1.10 FAQ.yo
--- Etc/FAQ.yo	16 Aug 2004 09:53:10 -0000	1.10
+++ Etc/FAQ.yo	11 Jan 2005 17:35:54 -0000
@@ -43,14 +43,14 @@
 whenman(report(ARG1)(ARG2)(ARG3))\
 whenms(report(ARG1)(ARG2)(ARG3))\
 whensgml(report(ARG1)(ARG2)(ARG3)))
-myreport(Z-Shell Frequently-Asked Questions)(Peter Stephenson)(2004/08/13)
+myreport(Z-Shell Frequently-Asked Questions)(Peter Stephenson)(2005/01/11)
 COMMENT(-- the following are for Usenet and must appear first)\
 description(\
 mydit(Archive-Name:) unix-faq/shell/zsh
-mydit(Last-Modified:) 2001/08/13
+mydit(Last-Modified:) 2005/01/11
 mydit(Submitted-By:) email(pws@xxxxxxxxxxxxxxxxxxxxxxxx (Peter Stephenson))
 mydit(Posting-Frequency:) Monthly
-mydit(Copyright:) (C) P.W. Stephenson, 1995--2004 (see end of document)
+mydit(Copyright:) (C) P.W. Stephenson, 1995--2005 (see end of document)
 )
 
 This document contains a list of frequently-asked (or otherwise
@@ -88,6 +88,7 @@
 2.4. tcsh?
 2.5. bash?
 2.6. Shouldn't zsh be more/less like ksh/(t)csh?
+2.7. What is zsh's support for Unicode/UTF-8?
 
 Chapter 3:  How to get various things to work
 3.1. Why does `$var' where `var="foo bar"' not do what I expect?
@@ -935,6 +936,46 @@
   help.
 
 
+sect(What is zsh's support for Unicode/UTF-8?)
+
+  `Unicode', or UCS for Universal Character Set, is the modern way
+  of specifying character sets.  It replaces a large number of ad hoc
+  ways of supporting character sets beyond ASCII.  `UTF-8' is an
+  encoding of Unicode that is particularly natural on Unix-like systems.
+
+  Q: Does zsh support UTF-8?
+
+  A: zsh's built-in printf command supports "\u" and "\U" escapes
+  to output arbitrary Unicode characters.  ZLE (the Zsh Line Editor) has
+  no concept of character encodings, and is confused by multi-octet
+  encodings.
+
+  Q: Why doesn't zsh have proper UTF-8 support?
+
+  A: The code has not been written yet.
+
+  Q: What makes UTF-8 support difficult to implement?
+
+  A: In order to handle arbitrary encodings the correct way, significant
+  and intrusive changes must be made to the shell.
+
+  Q: Why can't zsh just use readline?
+
+  A: ZLE is not encapsulated from the rest of the shell.  Isolating it
+  such that it could be replaced by readline would be a significant
+  effort.  Furthermore, using readline would effect a significant loss of
+  features.
+
+  Q: What changes are planned?
+
+  A: Introduction of Unicode support will be gradual, so if you are
+  interested in being involved you should join the zsh-workers mailing
+  list.  As a first step ZLE will be rewritten to use wide characters
+  internally.  Character based widgets can then operate on a single wide
+  character instead of a single byte, and the proper display width can be
+  calculated with wcswidth().
+
+
 chapter(How to get various things to work)
 
 sect(Why does mytt($var) where mytt(var="foo bar") not do what I expect?)
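
As a rough illustration of the last answer in the patch (this is not code
from zsh itself), the difference between counting bytes, counting
characters, and computing display width might look like this in C.  The
helper names utf8_char_count() and display_width() are invented for this
sketch; wcswidth() is the standard POSIX function, and the character
counter assumes its input is valid UTF-8.

```c
#define _XOPEN_SOURCE 700   /* expose wcswidth() in <wchar.h> */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count the characters in a UTF-8 string by
   skipping continuation bytes (those of the form 10xxxxxx).
   Assumes the input is valid UTF-8. */
size_t utf8_char_count(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* Hypothetical helper: display width of a wide string in terminal
   columns.  wcswidth() returns -1 if any character is not printable
   in the current locale. */
int display_width(const wchar_t *ws)
{
    assert(ws != NULL);
    return wcswidth(ws, wcslen(ws));
}
```

For the UTF-8 string "h\xc3\xa9llo", strlen() reports 6 bytes while
utf8_char_count() reports 5 characters, which is the kind of mismatch
that confuses a byte-oriented line editor.  Widths for non-ASCII text
depend on the locale, so a real program would call setlocale() before
using wcswidth().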

-- 
Peter Stephenson <pws@xxxxxxx>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070

