RE: Normalization Form KC for Linux

From: Addison Phillips (AddisonP@simultrans.com)
Date: Wed Aug 18 1999 - 15:58:55 EDT


Old-fashioned FACE-like interfaces will not survive Form KC, because they
rely on half- and full-width forms for positioning, and KC replaces those
compatibility forms with their ordinary equivalents.

There are a large number of these. While we can all wish that these hoary
old beasts would go away, several hundred (thousand??) character-mode
screens that I never want to see again rely on this stuff.
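
To make the positioning problem concrete, here is a minimal sketch in
Python using the standard unicodedata module. The helper display_width is
hypothetical and deliberately crude: it assumes a terminal that gives two
columns to East Asian Wide and Fullwidth characters and one column to
everything else, and it ignores combining marks.

```python
import unicodedata


def display_width(text: str) -> int:
    """Rough column count: Wide/Fullwidth characters take two cells.

    Hypothetical helper for illustration only; real terminals also have
    to handle combining marks and ambiguous-width characters.
    """
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


fullwidth_label = "\uff21\uff22\uff23"  # FULLWIDTH LATIN CAPITAL LETTERS A, B, C
halfwidth_kana = "\uff76\uff85"         # HALFWIDTH KATAKANA LETTERS KA, NA

# NFKC folds both kinds of compatibility form to their ordinary characters.
folded_label = unicodedata.normalize("NFKC", fullwidth_label)
folded_kana = unicodedata.normalize("NFKC", halfwidth_kana)

# The text means the same thing afterwards, but occupies different columns.
for before, after in ((fullwidth_label, folded_label),
                      (halfwidth_kana, folded_kana)):
    print(display_width(before), "->", display_width(after))
```

Under these assumptions the fullwidth label shrinks and the halfwidth kana
grows, so a screen layout built on either one would shift.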

AP
        __________________________________________

        Addison Phillips
        Director, Globalization Consulting
        SimulTrans, L.L.C.
        AddisonP@simultrans.com (Internet email)
        http://www.simultrans.com (website)

        "22 languages. One release date."
        __________________________________________

-----Original Message-----
From: Markus Kuhn [mailto:Markus.Kuhn@cl.cam.ac.uk]
Sent: Wednesday, August 18, 1999 1:34 AM
To: Unicode List
Cc: recode-forum@iro.umontreal.ca
Subject: Normalization Form KC for Linux

I was never too happy with the UCS implementation levels, and after
reading Unicode Tech Report #15, I think I have now seen the light and I
have just added in

  http://www.cl.cam.ac.uk/~mgk25/unicode.html

in section "How should Unicode be used under Linux?" the following
paragraph:

  One day, combining characters will surely be supported under Linux, but
  even then the precomposed characters should be preferred over combining
  character sequences where available. More formally, the preferred way of
  encoding text in Unicode under Linux should be Normalization Form KC as
  defined in Unicode Technical Report #15
  <http://www.unicode.org/unicode/reports/tr15/>.
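
As a small illustration of what that paragraph asks for, the following
sketch uses Python's standard unicodedata module (an assumption of this
example, not something the recommendation depends on). The sample strings
are chosen only for illustration.

```python
import unicodedata

# A base letter followed by COMBINING ACUTE ACCENT (U+0301).
decomposed = "e\u0301"

# Form KC composes the pair into the precomposed character U+00E9.
precomposed = unicodedata.normalize("NFKC", decomposed)

# "Where available": there is no precomposed q-with-acute, so the
# combining sequence is left as it is.
no_precomposed = unicodedata.normalize("NFKC", "q\u0301")

# The "K" part also replaces compatibility characters, such as the
# LATIN SMALL LIGATURE FI (U+FB01), with their plain equivalents.
ligature_folded = unicodedata.normalize("NFKC", "\ufb01")

print(len(decomposed), len(precomposed))
print(no_precomposed == "q\u0301")
print(ligature_folded)
```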

I hope this recommendation meets with general approval. I would even
suggest that programs such as less and ls be extended to replace, on
output, any characters in file names or text files that do not conform to
Normalization Form KC with \xx hex escape sequences, so that users can
spot these potential troublemakers more easily.
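
One way such an output filter might look is sketched below in Python. This
is not how less or ls work. The function name escape_non_nfkc is
hypothetical, and so are the choices to escape the UTF-8 bytes and to test
each base character together with the combining marks that follow it.
That grouping is a simplification and does not cover every case, for
example conjoining Hangul jamo.

```python
import unicodedata


def escape_non_nfkc(text: str) -> str:
    """Replace pieces of text that are not in Form KC with \\xNN escapes.

    Hypothetical sketch: the text is split into a base character plus
    any following combining marks, and each piece that NFKC would change
    is written out as the hex escapes of its UTF-8 bytes.
    """
    # Group each character with the combining marks that follow it.
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)

    out = []
    for cluster in clusters:
        if unicodedata.normalize("NFKC", cluster) == cluster:
            out.append(cluster)  # already conforming, print as is
        else:
            out.append(
                "".join(f"\\x{b:02x}" for b in cluster.encode("utf-8"))
            )
    return "".join(out)


print(escape_non_nfkc("caf\u00e9"))   # precomposed, left alone
print(escape_non_nfkc("cafe\u0301"))  # combining sequence, escaped
```

A file name that looks identical to "café" on screen but uses the
combining sequence would then show up with visible escapes.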

It might be a very nice idea to have all the Unicode Normalization forms
added to GNU recode or iconv.
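
For comparison, a converter covering all four forms is easy to sketch in
Python on top of the standard unicodedata module. This says nothing about
how recode or iconv would implement it. The function name normalize_lines
and the line-by-line interface are assumptions of this sketch.

```python
import unicodedata

# The four normalization forms defined in Technical Report #15.
FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def normalize_lines(lines, form="NFKC"):
    """Yield each input line converted to the requested form.

    Hypothetical helper: works line by line so it can be used as a
    streaming filter over a text file.
    """
    if form not in FORMS:
        raise ValueError(f"unknown normalization form: {form}")
    for line in lines:
        yield unicodedata.normalize(form, line)


sample = ["re\u0301sume\u0301 \ufb01le\n"]
for form in FORMS:
    print(form, [len(line) for line in normalize_lines(sample, form)])
```

Passing an open text file as lines and writing the results to standard
output would give a simple command-line filter.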

Markus

--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>


