RE: Compatibility Character (was: Re: Emoji: emoticons vs. literacy)

From: Phillips, Addison (addison@amazon.com)
Date: Wed Jan 14 2009 - 22:50:31 CST


    > Doug responded to Ken with:
    >
    > >>> "/Compatibility Character./
    > >>> A character that would not have been encoded except for
    > >>> compatibility and round-trip convertibility with other
    > >>> standards"
    > >
    > > Jukka Korpela responded:
    > >
    > >> It's a pseudo-definition.
    > >
    > > Which is nonsense, I'm afraid. What Asmus cited is a descriptive
    > > definition of the term, as used by the folks in the UTC
    > > (past and current) who have developed and maintain the standard.
    >
    > That is indeed the glossary definition, and the first sentence of
    > Section 2.3. However, the second sentence of Section 2.3 immediately
    > goes on to add that they are "variants of characters that already
    > have encodings as normal characters."
    >
    > So if the truncated definition as found in the glossary is the one
    > the folks in the UTC have been using, then the presence of the
    > following sentence is a bit misleading, and hopefully this will be
    > clarified in the 6.0 book.
    >

    I think you might be going too far. Ken is correct about the definition of a "compatibility" character. I must admit that I have often tended to think of it as a character with a compatibility decomposition, but that is very obviously not right (or at least incomplete). There is nothing wrong with the additional sentence: none of the compatibility characters would have been encoded, save for the compatibility problem, and all are variants of "normal" characters (or character sequences). This means, quite clearly, that the characters in question are already encoded in some other form (perhaps as a combining sequence). The actual quote is:

    --
    Conceptually, compatibility characters are those that would not have been encoded except
    for compatibility and round-trip convertibility with other standards. They are variants of
    characters that already have encodings as normal (that is, non-compatibility) characters in
    the Unicode Standard; as such, they are more properly referred to as compatibility variants.
    --

    In fact, Section 2.3 is quite a bit longer than just these two sentences, and it is important to understand it 'in toto' rather than trying to parse it word by word. Unicode 6.0 cannot "solve" this problem, because there are many ways to become a compatibility character. But all flow from the basic definition: a character that would not have been encoded except for round-trip or other compatibility considerations.
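
    As a rough illustration of why the "character with a compatibility decomposition" shorthand is incomplete, here is a minimal Python sketch. It assumes the standard library's unicodedata module, and the helper name has_compat_decomposition is purely hypothetical. It tests only the narrow, mechanical property (a tagged decomposition mapping such as <compat> or <super>), not the conceptual definition quoted above.

```python
import unicodedata


def has_compat_decomposition(ch: str) -> bool:
    """Hypothetical helper: True if ch has a *compatibility* decomposition.

    unicodedata.decomposition() returns the raw mapping from the character
    database. Compatibility mappings start with a tag such as "<compat>";
    canonical mappings have no tag; characters with no mapping return "".
    """
    return unicodedata.decomposition(ch).startswith("<")


samples = [
    "\uFB01",  # LATIN SMALL LIGATURE FI: tagged <compat> mapping
    "\u00B2",  # SUPERSCRIPT TWO: tagged <super> mapping
    "\u212B",  # ANGSTROM SIGN: canonical (untagged) singleton mapping
    "\u00C5",  # LATIN CAPITAL LETTER A WITH RING ABOVE: canonical mapping
    "A",       # LATIN CAPITAL LETTER A: no decomposition at all
]

for ch in samples:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"compat decomposition = {has_compat_decomposition(ch)}")
```

    The check reports False for ANGSTROM SIGN because its mapping is canonical, even though that character is often described as having been encoded for compatibility with other standards. That gap is one reason the decomposition test should be treated as a heuristic only, not as the definition.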
    Addison
    Addison Phillips
    Globalization Architect -- Lab126
    Internationalization is not a feature.
    It is an architecture.
    

