Re: Languages supported by UTF8 and UTF16

From: Richard Wordingham ([email protected])
Date: Sun Sep 11 2005 - 13:57:43 CDT

    ----- Original Message -----
    From: "Anto'nio Martins-Tuva'lkin" <[email protected]>
    To: <[email protected]>
    Sent: Sunday, September 11, 2005 1:55 PM
    Subject: Re: Languages supported by UTF8 and UTF16

    > On 2005.09.10, 23:40, Jukka K. Korpela <[email protected]> wrote:
    >
    >> Unicode contains _most_ accented letters used in human languages
    >> as precomposed characters, but not all. There's a clear distinction
    >> here.
    >
    > Considering what canonical decomposition means, and that e.g. U+006F
    > U+0301 is absolutely identical to U+00F3, that distinction, however clear,
    > is meaningless.

    However, they may render differently! For example, Lucida Sans Unicode
    Version 2.0 (dated 1993) has a glyph for U+0323 COMBINING DOT BELOW, but
    none for U+1E6D LATIN SMALL LETTER T WITH DOT BELOW. So U+0074 U+0323 is
    rendered from that font, but U+1E6D is not, even though the two are
    canonically equivalent.
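
The equivalence discussed here can be checked mechanically. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `canonically_equivalent` is made up for this illustration and is not part of any library. It only shows that the sequences normalize to the same thing. It says nothing about whether a given font has a glyph for either form, which is the rendering problem described above.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Return True if a and b have the same canonical decomposition (NFD).

    Hypothetical helper for illustration only.
    """
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# The two pairs mentioned in the thread: (combining sequence, precomposed).
pairs = [
    ("\u006F\u0301", "\u00F3"),  # o + COMBINING ACUTE ACCENT vs. o with acute
    ("\u0074\u0323", "\u1E6D"),  # t + COMBINING DOT BELOW vs. t with dot below
]

for sequence, precomposed in pairs:
    # The strings differ code point by code point...
    different_code_points = sequence != precomposed
    # ...but they are canonically equivalent.
    equivalent = canonically_equivalent(sequence, precomposed)
    # NFC composes the sequence into the precomposed character.
    composed = unicodedata.normalize("NFC", sequence)
    print(different_code_points, equivalent, composed == precomposed)
```

A renderer that works from the raw code points sees two different inputs here, so a font that covers only U+0323 can display one form and not the other unless the text is normalized, or the rendering engine decomposes, before glyph lookup.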

    Richard.


