Re: Languages supported by UTF8 and UTF16

From: Richard Wordingham (richard.wordingham@ntlworld.com)
Date: Sun Sep 11 2005 - 13:57:43 CDT

    ----- Original Message -----
    From: "Anto'nio Martins-Tuva'lkin" <antonio@tuvalkin.web.pt>
    To: <unicode@unicode.org>
    Sent: Sunday, September 11, 2005 1:55 PM
    Subject: Re: Languages supported by UTF8 and UTF16

    > On 2005.09.10, 23:40, Jukka K. Korpela <jkorpela@cs.tut.fi> wrote:
    >
    >> Unicode contains _most_ accented letters used in human languages
    >> as precomposed characters, but not all. There's a clear distinction
    >> here.
    >
    > Considering what canonical decomposition means, and that e.g. U+006F
    > U+0301 is absolutely identical to U+00F3, that distinction, however clear,
    > is meaningless.

    However, they may render differently! For example, Lucida Sans Unicode
    Version 2.0 (dated 1993) has a glyph for U+0323 COMBINING DOT BELOW, but
    none for U+1E6D LATIN SMALL LETTER T WITH DOT BELOW. So U+0074 U+0323 is
    rendered from that font, but U+1E6D is not, even though the two sequences
    are canonically equivalent.
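
The equivalence both messages rely on can be checked with a short sketch using Python's standard `unicodedata` module. The variable names are only illustrative, and this says nothing about which glyphs a given font contains; it only shows that the two code point sequences differ as stored but normalize to each other.

```python
import unicodedata

# The two spellings discussed above.
composed = "\u1e6d"      # U+1E6D LATIN SMALL LETTER T WITH DOT BELOW
decomposed = "t\u0323"   # U+0074 followed by U+0323 COMBINING DOT BELOW

# As raw code point sequences the strings are different...
raw_equal = composed == decomposed

# ...but canonical normalization maps each form onto the other.
nfc_equal = unicodedata.normalize("NFC", decomposed) == composed
nfd_equal = unicodedata.normalize("NFD", composed) == decomposed

# The same holds for the quoted example, U+006F U+0301 versus U+00F3.
o_acute_equal = unicodedata.normalize("NFC", "o\u0301") == "\u00f3"

print(raw_equal, nfc_equal, nfd_equal, o_acute_equal)
```

A renderer that does not normalize before glyph lookup has to find each sequence in the font separately, which is how the two forms can end up displayed differently.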

    Richard.


