Re: Languages supported by UTF8 and UTF16

From: Michael Everson
Date: Sat Sep 10 2005 - 14:39:23 CDT

    At 12:23 -0700 2005-09-10, Mark Davis wrote:

    >1. It is not true of "all living languages"; there are some minority
    >languages that need additional characters. (Part of the problem here
    >is that we didn't apply the generative model consistently enough;
    >had we done that, many of these characters could be represented
    >right now by sequences.)

    Well, you'd have to give examples of what you mean by THAT, Mark.

    >3. The 'however' is misleading. It is not a deficiency that some of
    >what users may perceive of as separate characters are encoded by

    No, but it's a problem, because font guys usually precompose, and
    only precomposed glyphs are *guaranteed* 'safe' for good,
    consistent typography.
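
The precomposed-versus-sequence distinction discussed above can be sketched with Python's standard unicodedata module. This is only an illustration under stated assumptions: the sample letters (e-acute and q-acute) and the helper names to_nfc and to_nfd are chosen here and are not taken from the thread.

```python
import unicodedata

# One user-perceived letter, two encodings of it.
precomposed = "\u00e9"     # LATIN SMALL LETTER E WITH ACUTE (one code point)
sequence = "e\u0301"       # e + COMBINING ACUTE ACCENT (two code points)

# Illustrative case: Unicode has no precomposed q-with-acute,
# so this letter can only ever be represented as a sequence.
no_precomposed = "q\u0301"


def to_nfc(text):
    """Compose sequences into precomposed characters where one exists."""
    return unicodedata.normalize("NFC", text)


def to_nfd(text):
    """Decompose precomposed characters into base + combining marks."""
    return unicodedata.normalize("NFD", text)


for label, text in [("precomposed", precomposed),
                    ("sequence", sequence),
                    ("no precomposed form", no_precomposed)]:
    print(label,
          "code points:", len(text),
          "UTF-8 bytes:", len(text.encode("utf-8")),
          "UTF-16 bytes:", len(text.encode("utf-16-le")),
          "unchanged by NFC:", to_nfc(text) == text)
```

Both forms are canonically equivalent, and either can be carried in UTF-8 or UTF-16. Whether a font renders the sequence as well as the precomposed glyph is a separate question, which is the typographic concern raised above.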

    >4. Also not a deficiency. If Unicode attempted to encode all
    >typographic constructs, it would be a horrible mess. It provides a
    >foundation for other mechanisms (CSS, etc) to build upon; they can
    >provide typographical constructs. And by 'orthographic constructs',
    >you'd have to provide examples of what you mean.

    What's a typographical construct, Mark?

    > > Some of the properties of characters as defined by the
    > > Unicode Standard do not correspond to their behavior in different
    > > languages.
    >5. Again, you'd have to provide examples to clarify what you mean.

    He probably means something like Russian-vs-Serbian italic small TE.

    >What the Unicode Consortium *does* provide is a mechanism for
    >providing language-specific tailorings of specified behavior. Look
    >at collation, for example, where the Unicode Consortium supplies a
    >default basis for ordering in the UCA, but then also provides a
    >repository of language-based tailorings of the UCA in the CLDR.

    Mark, we are a lo-o-o-ng way from user-tailorable collation on ANY platform.
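
The idea of a default ordering plus a language-specific tailoring can be illustrated with a toy sketch. It is not the UCA and does not read CLDR data. The word list, the Swedish alphabet string, and the functions default_key and swedish_key are assumptions made up for this example. It shows only that the same strings sort differently once a language rule (here, Swedish placing å, ä, ö after z) overrides the default.

```python
import unicodedata


def default_key(word):
    """Toy default order: compare base letters first, ignoring accents."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The original word breaks ties between otherwise equal base strings.
    return (base, word)


# Hypothetical tailoring table: Swedish treats å, ä, ö as separate
# letters that follow z, rather than as accented variants of a and o.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
SWEDISH_RANK = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}


def swedish_key(word):
    """Toy tailored order: rank each letter by its Swedish alphabet position."""
    composed = unicodedata.normalize("NFC", word.lower())
    # Letters outside the table sort after everything in it.
    return [SWEDISH_RANK.get(ch, len(SWEDISH_ALPHABET)) for ch in composed]


words = ["zebra", "ärlig", "apa", "öl", "orm"]
default_sorted = sorted(words, key=default_key)
swedish_sorted = sorted(words, key=swedish_key)

print("default :", default_sorted)
print("tailored:", swedish_sorted)
```

A real implementation would use multi-level collation weights as the UCA specifies, with the tailoring supplied as data rather than hard-coded.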

    Michael Everson

    This archive was generated by hypermail 2.1.5 : Sat Sep 10 2005 - 14:45:04 CDT