Re: New Canonical Decompositions to Non-Starters

From: Asmus Freytag <>
Date: Sun, 17 Feb 2013 10:12:26 -0800

On 2/17/2013 8:20 AM, Richard Wordingham wrote:
> Is there any guarantee that U+E4567 will not have a
> canonical decomposition mapping to <U+0F73 TIBETAN VOWEL SIGN II,
> U+E4568>? If so, where is it published? I thought we had guarantees
> that new canonical decompositions to non-starters would not be created
> (to <U+0F71, U+0F72, U+E4568> in this case), but I cannot find it. This
> conceivable decomposition mapping appears to wriggle through a
> loophole because U+0F73 is a starter, i.e. has canonical combining
> class 0.
> Richard.
Let me see whether I follow that.

If you encode a new character, it can have a decomposition only if that
decomposition also contains at least one new character. Otherwise, you
might have existing data that contains that decomposition but wasn't
previously normalizable with NFC (and now would be).
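
As a rough illustration of that constraint, here is a minimal Python sketch.
The function names are hypothetical, and treating General_Category Cn
(unassigned) as a stand-in for "new character" is an assumption made only for
this example. The answer also depends on the Unicode version bundled with the
Python build that runs it.

```python
import unicodedata
from typing import Sequence

def is_unassigned(cp: int) -> bool:
    """True if the code point is unassigned (Cn) in this Python's Unicode data."""
    return unicodedata.category(chr(cp)) == "Cn"

def has_new_character(decomposition: Sequence[int]) -> bool:
    """Hypothetical check: a decomposition proposed for a newly encoded
    character should contain at least one character that is itself new."""
    return any(is_unassigned(cp) for cp in decomposition)

# Sequence from the quoted question: U+E4568 is not yet encoded, so no
# existing data can legitimately contain <U+0F73, U+E4568>.
with_new = has_new_character([0x0F73, 0xE4568])

# Both members already encoded: existing text may contain this sequence,
# so a new character could not safely decompose to it.
without_new = has_new_character([0x0F71, 0x0F72])
```

The first call reports that the sequence includes a new character, the second
that it does not.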

Now, does it make a difference whether that required new character in
the decomposition is the first or the second? (Remember, all
decompositions are defined to be pairs, except when they are singletons.
If a one-to-many mapping is desired, enough intermediate, partially
composed characters must exist to allow this longer mapping to be
represented as a chain of simpler mappings.) And if it does, can one
point to a stability guarantee where that is expressed?
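
The "chain of simpler mappings" can be seen in existing data. The sketch below
uses Python's standard unicodedata module; the helper name decomposition_chain
is hypothetical, and it follows only the first element of each mapping, which
is enough for this example. U+1EA5 is used as a familiar case of a three-way
decomposition expressed as two pairs, and the last lines look at the
characters from the quoted question.

```python
import unicodedata

def decomposition_chain(ch: str) -> list:
    """Follow canonical decomposition mappings step by step, starting at ch.

    Returns a list of (code point, mapping) pairs, both as hex strings.
    Only the first element of each mapping is followed further.
    """
    steps = []
    current = ch
    while True:
        mapping = unicodedata.decomposition(current)
        # An empty string means no mapping; a leading "<tag>" marks a
        # compatibility mapping, which is not canonical.
        if not mapping or mapping.startswith("<"):
            break
        steps.append((f"{ord(current):04X}", mapping))
        current = chr(int(mapping.split()[0], 16))
    return steps

# U+1EA5 LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE
chain = decomposition_chain("\u1EA5")
full = unicodedata.normalize("NFD", "\u1EA5")

# Characters from the quoted question: U+0F73 and its decomposition.
ii_mapping = unicodedata.decomposition("\u0F73")
ii_class = unicodedata.combining("\u0F73")
part_classes = [unicodedata.combining(c) for c in "\u0F71\u0F72"]
```

Each step in the chain is a pair, while the full NFD form of U+1EA5 is three
code points long. For U+0F73, the character itself has combining class 0 while
both members of its decomposition have non-zero classes.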

Is that what you are asking?

Received on Sun Feb 17 2013 - 12:18:34 CST
