Hello,
I have some problems regarding the decomposition of characters:
1. The standard contains a character block for CJK compatibility
ideographs that are used in other standards to indicate
alternative pronunciations. The block name suggests that these
characters are compatibility equivalent to their counterparts in
the CJK unified ideographs block. However, as they only represent
alternative pronunciations of the same character, they might also
be considered canonically equivalent. I could not figure out from
the standard which of these two kinds of equivalence is intended.
2. There are decomposition rules for Hangul syllables into Hangul
jamo. Is this decomposition considered to be a canonical
decomposition, a compatibility decomposition, or is it a
different kind of decomposition altogether?
3. It is possible to build a character "latin capital letter DZ with
acute" by building the sequence
"latin capital letter DZ"; "combining acute accent"
which would be rendered as a DZ with an acute placed above the
middle of the DZ.
Applying the compatibility decomposition of "latin capital letter DZ"
results in the sequence
"latin capital letter D"; "latin capital letter Z";
"combining acute accent"
which would be rendered as a D followed by a Z with an acute only
on top of the Z.
I would consider these two renderings as having different semantics.
This means that a compatibility decomposition changes more than
just formatting information (though one might argue that a
DZ with an ACUTE above is not a character used in any language; but
then consider the appearance of new languages ...).
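One way to inspect the three cases above is sketched here with Python's
standard unicodedata module. This is only an illustration under stated
assumptions: the helper names (decomposition_kind, codepoints) are
hypothetical, U+F900 and U+AC00 are arbitrary sample characters, and the
answers reflect whichever version of the Unicode data ships with the
interpreter, not necessarily the edition of the standard discussed here.

```python
import unicodedata


def decomposition_kind(ch):
    """Classify how a single character reacts to normalization.

    Hypothetical helper: returns "canonical" if NFD changes the
    character, "compatibility" if only NFKD changes it, else "none".
    """
    if unicodedata.normalize("NFD", ch) != ch:
        return "canonical"
    if unicodedata.normalize("NFKD", ch) != ch:
        return "compatibility"
    return "none"


def codepoints(s):
    """Render a string as a space-separated list of code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


# Sample characters, chosen only for illustration.
cjk = "\uF900"     # a character from the CJK compatibility ideographs block
hangul = "\uAC00"  # HANGUL SYLLABLE GA
dz = "\u01F1"      # LATIN CAPITAL LETTER DZ
acute = "\u0301"   # COMBINING ACUTE ACCENT

# Questions 1 and 2: which kind of decomposition applies?
for label, ch in (("CJK compat", cjk), ("Hangul", hangul), ("DZ", dz)):
    nfkd = unicodedata.normalize("NFKD", ch)
    print(label, codepoints(ch), "->", codepoints(nfkd),
          "kind:", decomposition_kind(ch))

# Question 3: the accent stays on the digraph under NFD, but ends up
# after the Z alone once the compatibility decomposition is applied.
seq = dz + acute
print("NFD :", codepoints(unicodedata.normalize("NFD", seq)))
print("NFKD:", codepoints(unicodedata.normalize("NFKD", seq)))
```

With the Unicode data bundled in current Python releases, the first two
samples are classified as canonical and DZ as compatibility, and the NFKD
form of the third case is D, Z, combining acute, which matches the
sequence described in item 3.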
--
Michael Mehlich               email: mmehlich@semdesigns.com
Semantic Designs, Inc.        voice: 512-250-1018
12636 Research Blvd. #C-214   fax:   512-250-1191
Austin, Texas 78759           www:   http://www.semdesigns.com