Re: NFD on u+AC00 contradicts NormalisationData.txt ?

From: Kenneth Whistler
Date: Wed Jun 14 2006 - 17:17:11 CDT


    Theodore H. Smith continued:

    > > Theodore H. Smith wrote:
    > >> Does AC00 actually decompose?
    > > Yes. See TUS section 3.12 "Conjoining Jamo Behavior", <http://
    > >>.
    > But why isn't it listed in UnicodeData.txt?

    TUS 4.0, p. 72:

    D23 Canonical decomposition: The decomposition of a character
    that results from recursively applying the canonical mappings
    found in the names list of Section 16.1, Character Names List,
    and those described in Section 3.12, Conjoining Jamo Behavior,
    until no characters can be further decomposed, and then
    reordering nonspacing marks according to Section 3.11,
    Canonical Ordering Behavior.
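
As a rough illustration of D23 for non-Hangul characters, the sketch below applies the canonical mappings recursively and then reorders runs of combining marks by combining class. It leans on Python's standard unicodedata module as a stand-in for the names list; the helper names (canonical_mapping, full_decompose, reorder_marks, nfd_sketch) are invented for this example, and Hangul syllables are deliberately left out because their mappings come from the Section 3.12 arithmetic rather than from the data file.

```python
import unicodedata

def canonical_mapping(ch):
    """Return the one-step canonical mapping of ch, or ch if it has none."""
    d = unicodedata.decomposition(ch)
    if not d or d.startswith("<"):
        # Empty: no mapping listed. "<tag>": compatibility mapping, not canonical.
        return ch
    return "".join(chr(int(cp, 16)) for cp in d.split())

def full_decompose(ch):
    """Apply canonical mappings recursively until nothing changes."""
    mapped = canonical_mapping(ch)
    if mapped == ch:
        return ch
    return "".join(full_decompose(c) for c in mapped)

def reorder_marks(s):
    """Stable-sort each run of non-starters by canonical combining class."""
    out, run = [], []
    for c in s:
        if unicodedata.combining(c) == 0:
            out.extend(sorted(run, key=unicodedata.combining))
            run = []
            out.append(c)
        else:
            run.append(c)
    out.extend(sorted(run, key=unicodedata.combining))
    return "".join(out)

def nfd_sketch(s):
    """Decompose, then reorder. Hangul syllables are NOT handled here."""
    return reorder_marks("".join(full_decompose(c) for c in s))

# U+1E69 maps to U+1E63 U+0307, and U+1E63 maps in turn to U+0073 U+0323.
print([f"{ord(c):04X}" for c in nfd_sketch("\u1E69")])
```

For strings without Hangul syllables this should agree with unicodedata.normalize("NFD", s); it is meant to show the shape of the definition, not to replace a real normalizer.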

    TUS 4.0, p. 418:

    A character names list is not provided for characters in
    the Hangul Syllables block, U+AC00..U+D7AF, because the
    name of a Hangul syllable can be determined by algorithm
    as described in Section 3.12, Conjoining Jamo Behavior.
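
The name derivation mentioned above can be sketched as follows. The constants and jamo short-name tables follow the Section 3.12 algorithm as I understand it; the function name hangul_syllable_name is made up for this example, so treat it as an illustration rather than normative text.

```python
import unicodedata

S_BASE = 0xAC00
V_COUNT, T_COUNT = 21, 28
N_COUNT = V_COUNT * T_COUNT   # 588 syllables per leading consonant
S_COUNT = 19 * N_COUNT        # 11,172 syllables in total

# Jamo short names: leading consonants, vowels, trailing consonants.
JAMO_L = ["G", "GG", "N", "D", "DD", "R", "M", "B", "BB", "S",
          "SS", "", "J", "JJ", "C", "K", "T", "P", "H"]
JAMO_V = ["A", "AE", "YA", "YAE", "EO", "E", "YEO", "YE", "O", "WA",
          "WAE", "OE", "YO", "U", "WEO", "WE", "WI", "YU", "EU", "YI", "I"]
JAMO_T = ["", "G", "GG", "GS", "N", "NJ", "NH", "D", "L", "LG",
          "LM", "LB", "LS", "LT", "LP", "LH", "M", "B", "BS", "S",
          "SS", "NG", "J", "C", "K", "T", "P", "H"]

def hangul_syllable_name(ch):
    """Derive the character name of a precomposed Hangul syllable."""
    s_index = ord(ch) - S_BASE
    if not 0 <= s_index < S_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    l_index = s_index // N_COUNT
    v_index = (s_index % N_COUNT) // T_COUNT
    t_index = s_index % T_COUNT
    return ("HANGUL SYLLABLE "
            + JAMO_L[l_index] + JAMO_V[v_index] + JAMO_T[t_index])

print(hangul_syllable_name("\uAC00"))  # HANGUL SYLLABLE GA
```

Python's unicodedata.name() computes these names too, which gives a convenient cross-check over the whole range.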


    AC00;<Hangul Syllable, First>;Lo;0;L;;;;;N;;;;;
    D7A3;<Hangul Syllable, Last>;Lo;0;L;;;;;N;;;;;

    Those entries mark the first and last code points of the
    Hangul syllable range. UnicodeData.txt does not list the 11,172
    Hangul syllables individually, because all of their names and
    decompositions are derivable by algorithm (see above).
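
To make the answer to the original question concrete, here is a minimal sketch of the arithmetic decomposition for the range U+AC00..U+D7A3. The constants are the ones given in Section 3.12; the function name decompose_hangul is hypothetical, and the code returns the full decomposition (L V or L V T) directly rather than going through the intermediate LV + T step.

```python
import unicodedata

S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT   # 588
S_COUNT = L_COUNT * N_COUNT   # 11,172

def decompose_hangul(ch):
    """Return the conjoining jamo sequence for a Hangul syllable.

    Characters outside U+AC00..U+D7A3 are returned unchanged.
    """
    s_index = ord(ch) - S_BASE
    if not 0 <= s_index < S_COUNT:
        return ch
    l = L_BASE + s_index // N_COUNT
    v = V_BASE + (s_index % N_COUNT) // T_COUNT
    t = T_BASE + s_index % T_COUNT
    result = chr(l) + chr(v)
    if t != T_BASE:           # T_BASE itself means "no trailing consonant"
        result += chr(t)
    return result

# U+AC00 decomposes even though UnicodeData.txt carries no mapping for it.
print([f"{ord(c):04X}" for c in decompose_hangul("\uAC00")])  # ['1100', '1161']
print(decompose_hangul("\uAC00") == unicodedata.normalize("NFD", "\uAC00"))  # True
```

So NFD of U+AC00 is <U+1100, U+1161>, which is consistent with the data file once the range entries are read as "handled by algorithm".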

