Re: Ambiguous compositions

From: Peter_Constable@sil.org
Date: Thu Dec 21 2000 - 12:20:51 EST


On 12/21/2000 02:19:55 AM "Mike Lischke" wrote:

>Hi all,
>
>while working on my Unicode library I found values in the Unicode database
>which have the same (single value) decomposition, e.g.:
>
>F907;CJK COMPATIBILITY IDEOGRAPH-F907;Lo;0;L;9F9C;;;;N;;;;;
>F908;CJK COMPATIBILITY IDEOGRAPH-F908;Lo;0;L;9F9C;;;;N;;;;;
>
>Neither of those values is in the composition exclusion list so I wonder
>how to find a correct reverse mapping to create the composite from 9F9C?
>Can somebody help please?

Technically, they do not need to be listed in the composition exclusion
list: all singleton decomposition mappings are, by definition, excluded
from composition. Read UTR 15 for the details.
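As a rough illustration (a sketch only, assuming Python's standard
unicodedata module; the helper name is_singleton is made up for this
example), one can check that both characters normalize to U+9F9C and that
U+9F9C is never composed back into either of them:

```python
import unicodedata

def is_singleton(ch: str) -> bool:
    """Hypothetical helper: True if ch has a canonical decomposition
    consisting of exactly one code point (a "singleton")."""
    fields = unicodedata.decomposition(ch).split()
    # Compatibility mappings begin with a tag such as "<compat>";
    # canonical mappings list only hexadecimal code points.
    return len(fields) == 1 and not fields[0].startswith("<")

for cp in (0xF907, 0xF908):
    ch = chr(cp)
    nfc = unicodedata.normalize("NFC", ch)
    print(f"U+{cp:04X} singleton={is_singleton(ch)} NFC=U+{ord(nfc):04X}")

# Composition leaves U+9F9C alone: there is no reverse mapping to pick.
assert unicodedata.normalize("NFC", chr(0x9F9C)) == chr(0x9F9C)
```

So a library does not need a reverse mapping for these entries at all; it
only needs to skip singletons when it builds its composition table.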

>Btw: the values above have "compatibility" in their names. Does this mean
>they represent a compatibility mapping although there is no compatibility
>tag in the decomposition section?

Weird, eh. Because they do not have a compatibility tag <...>, they are not
compatibility decompositions but rather canonical decompositions, in spite
of the name. The net effect is the same, though: these characters are not
essential to encode all of the semantic distinctions one might need (except
when mapping to legacy-encoded data is involved).
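To make the tag rule concrete, here is a small sketch (again assuming
Python's standard unicodedata module; decomposition_kind is a hypothetical
name chosen for this example) that classifies a character by looking only
at the decomposition field, not at the character name:

```python
import unicodedata

def decomposition_kind(ch: str) -> str:
    """Hypothetical helper: classify the decomposition mapping of ch."""
    mapping = unicodedata.decomposition(ch)
    if not mapping:
        return "none"
    # A leading "<tag>" marks a compatibility decomposition;
    # a mapping without a tag is canonical.
    return "compatibility" if mapping.startswith("<") else "canonical"

for cp in (0xF907, 0xFB01, 0x0041):
    ch = chr(cp)
    print(f"U+{cp:04X} {unicodedata.name(ch)}: {decomposition_kind(ch)}")
```

One consequence of the mapping being canonical is that even NFD and NFC,
not just NFKD and NFKC, replace U+F907 with U+9F9C.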

- Peter

---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <peter_constable@sil.org>


