From: Asmus Freytag (firstname.lastname@example.org)
Date: Sun Jan 04 2009 - 19:03:22 CST
On 1/3/2009 10:45 PM, Doug Ewell wrote:
> Asmus Freytag <asmusf at ix dot netcom dot com> wrote:
>>> Seems to me that "compatibility characters" means whatever you want
>>> it to mean at a given moment.
>> I simply follow the definition. See, for example the glossary:
>> "/Compatibility Character. /
>> A character that would not have been encoded except for compatibility
>> and round-trip convertibility with other standards"
> This definition also appears in Section 2.3 (p. 23) of TUS 5.0, but
> the *very next sentence* says:
> "They are variants of characters that already have encodings as normal
> (that is, non-compatibility) characters in the Unicode Standard; as
> such, they are more properly referred to as compatibility variants."
> Now what?
It's an attempt to separate the two facets of compatibility: One is
based on interoperability needs being the primary basis for encoding the
character. The other is based on a character having a compatibility
decomposition. The latter are the ones that could be called
"compatibility variants", because they can be considered a variant of an
existing (ordinary) character.
(In discussions like this, I personally prefer the term "ordinary" in
place of the more cumbersome circumlocution "normal (that is,
non-compatibility)".)
It should be immediately obvious that not all characters needed for
interoperability (compatibility) can be guaranteed to have an ordinary
character counterpart. Therefore, some characters that look like
ordinary characters (because they don't have a compatibility
decomposition) are in fact encoded for compatibility.
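The second facet is the only one that can be checked mechanically. A minimal sketch, assuming Python's standard unicodedata module (the helper name has_compat_decomposition is hypothetical): a compatibility decomposition is marked by a leading <tag> in the decomposition field, while a canonical decomposition has no tag.

```python
import unicodedata

def has_compat_decomposition(ch: str) -> bool:
    """True if ch has a compatibility (tagged) decomposition mapping."""
    # unicodedata.decomposition returns '' for no mapping, a plain
    # sequence of code points for a canonical mapping, and a string
    # starting with a tag such as '<compat>' or '<font>' for a
    # compatibility mapping.
    return unicodedata.decomposition(ch).startswith("<")

# U+FB01 LATIN SMALL LIGATURE FI: compatibility variant of "fi"
print(has_compat_decomposition("\uFB01"))
# "A": no decomposition at all
print(has_compat_decomposition("A"))
# U+00E9: canonical decomposition only, so not a compatibility variant
print(has_compat_decomposition("\u00E9"))
```

A False result says nothing about the first facet: a character encoded purely for interoperability may have no decomposition, and this check cannot tell it apart from an ordinary character.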
To make matters slightly more complicated, a huge number of characters
that have compatibility decompositions represent both semantic and
glyphic variation from a corresponding ordinary character. In
notational systems where the glyphic variation applies to single
characters, these act like ordinary characters. In texts not using such
notational systems and where these variations apply to entire text runs,
they should not be used, but can be replaced by the ordinary character
plus style markup in rich text. (Examples include the phonetic
characters, math alphanumerics etc.).
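The replacement of such a character by "ordinary character plus style markup" can be sketched as follows. This is only an illustration, assuming Python's standard unicodedata module; the function name fold_with_style is hypothetical, and the decomposition tag stands in for whatever markup a rich-text format would actually use.

```python
import unicodedata

def fold_with_style(text):
    """Yield (ordinary_text, tag) pairs for each character in text.

    tag is the compatibility decomposition tag (for example '<font>')
    or None when the character has no compatibility decomposition.
    """
    for ch in text:
        decomp = unicodedata.decomposition(ch)
        if decomp.startswith("<"):
            tag = decomp.split(" ", 1)[0]
            # NFKC applies the compatibility mapping, giving the
            # ordinary counterpart; the tag records what was lost.
            yield unicodedata.normalize("NFKC", ch), tag
        else:
            yield ch, None

# U+1D400 MATHEMATICAL BOLD CAPITAL A followed by a plain "x"
pairs = list(fold_with_style("\U0001D400x"))
print(pairs)
```

In mathematical notation, where the bold letter is a distinct symbol, this folding would destroy meaning; it is only appropriate for text that uses the character as a styling shortcut.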
The set of emoji (and also emoticons) is composed of many ordinary
characters (straightforward symbols), plus compatibility characters that
do not have a decomposition.
Finally, I don't always distinguish between characters that should be
"candidates for encoding as compatibility characters" and "compatibility
characters" (which implies that they are actually encoded already).
That's because it's usually clear from context whether characters are
being proposed or are already encoded. This is just a matter of avoiding
cumbersome expressions that are redundant from context - it does *not*
imply that I consider these "as good as encoded". Just thought I might
clarify that, while we are at it.
>> What is it with you people? Everything apparently must be black or
>> white. Character coding is an exercise in dealing with shades of gray
>> and edge cases.
> At least now when I see a black-and-white statement such as "Unicode
> does not encode idiosyncratic, personal, novel, or private-use
> characters, nor does it encode logos or graphics," I know how to
> interpret it.
Yes, "graphics" is not a very well-defined term ;-) And "novel" would
have encompassed the Euro sign before 2002, yet it was coded well in
advance of the actual introduction of that currency.
> I've been a huge and vocal supporter of the Unicode Standard for the
> past 16 years, back before most people had heard of it, and this is by
> far the most disappointed I have ever been in the Standard. This
> decision will come back to haunt Unicode again and again.
First, there hasn't been a decision. Certainly not a final one. So it's
a bit premature to express things this way.
Second, if you've been around that long, you might have heard about
similar discussions where people were predicting bad outcomes from
certain decisions. Surprisingly enough, things didn't always turn out as
badly as predicted. Some issues, after being hotly contested and taking
truly enormous bandwidth in the committee, and on the lists, have sunk
out of sight without a trace, the minute they were decided (and seem to
have had no observable impact on the standard). Astonishing, but true.
Third, I really hope that no single issue can affect your support for
the standard, if it's sustained you for 16 years so far.