L2/07-153

Subject: Compatibility decomposition related descriptions in UAX #15
Date: May 7th, 2007
From: Ienup Sung

1. Issue:

Revision 27 of UAX #15 and the proposed update (revision 28) both have the following canonical decomposition and compatibility decomposition descriptions in Section 10:

    Canonical decomposition is the process of taking a string,
    recursively replacing composite characters using the Unicode
    canonical decomposition mappings (including the algorithmic Hangul
    canonical decomposition mappings; see Section 16, Hangul), and
    putting the result in canonical order.

    Compatibility decomposition is the process of taking a string,
    replacing composite characters using both the Unicode canonical
    decomposition mappings and the Unicode compatibility decomposition
    mappings, and putting the result in canonical order.

I think the compatibility decomposition definition above is a possible source of confusion for less careful readers. Unlike the canonical decomposition description, it neither says "recursively" nor mentions the algorithmic Hangul decomposition mappings, and so it differs from D65 (D20 in older versions):

    D65 Compatibility decomposition: The decomposition of a character
    that results from recursively applying both the compatibility
    mappings and the canonical mappings found in the Unicode Character
    Database, and those described in Section 3.12, Conjoining Jamo
    Behavior, until no characters can be further decomposed, and then
    reordering nonspacing marks according to Section 3.11, Canonical
    Ordering Behavior.

This possible confusion is further amplified by the following sentence in the sub-section "Hangul Composition" of Section 16, since it does not mention NFKD:

    Notice an important feature of Hangul composition: whenever the
    source string is not in Normalization Form D, one cannot just
    detect character sequences of the form <L, V> and <L, V, T>.

2. Proposal:

I'd like to propose changing the UAX #15 text for the compatibility decomposition description in Section 10 to something like the following (changed lines are marked with "|"):

    Compatibility decomposition is the process of taking a string,
    recursively replacing composite characters using both the Unicode   |
    canonical decomposition mappings (including the algorithmic Hangul  |
    canonical decomposition mappings) and the Unicode compatibility     |
    decomposition mappings, and putting the result in canonical order.

and changing the sentence in the sub-section "Hangul Composition" of Section 16 to something like the following:

    Notice an important feature of Hangul composition: whenever the
    source string is not in Normalization Form D or Normalization Form
    KD,                                                                 |
    one cannot just detect character sequences of the form <L, V> and
    <L, V, T>.

END_OF_MEMO.
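
The two points raised in this memo can be checked with a small sketch. It assumes Python's standard unicodedata module as the normalization implementation; that module, the helper name code_points, and the sample characters U+01C4, U+AC00, U+AC01 and U+11A8 are illustrative choices, not something taken from UAX #15. The sketch shows that NFKD applies compatibility and canonical mappings recursively, that NFKD also applies the algorithmic Hangul decomposition, and that a source string of the form <LV, T> (which is in neither NFD nor NFKD) still composes to a single syllable.

```python
import unicodedata


def code_points(s):
    """Return the code points of s as a list of 'U+XXXX' strings (illustrative helper)."""
    return [f"U+{ord(c):04X}" for c in s]


# 1. Recursion: U+01C4 has a compatibility mapping to <U+0044, U+017D>,
#    and U+017D in turn has a canonical mapping to <U+005A, U+030C>.
#    A single, non-recursive replacement pass would stop at U+017D.
dz_caron = "\u01C4"
nfkd_dz = unicodedata.normalize("NFKD", dz_caron)
print(code_points(nfkd_dz))

# 2. Hangul: NFKD applies the algorithmic Hangul decomposition just as
#    NFD does, so a precomposed syllable becomes conjoining jamo <L, V, T>.
syllable = "\uAC01"
nfd_syllable = unicodedata.normalize("NFD", syllable)
nfkd_syllable = unicodedata.normalize("NFKD", syllable)
print(code_points(nfd_syllable), code_points(nfkd_syllable))

# 3. Composition: <LV, T> is in neither NFD nor NFKD, yet it must still
#    compose to the LVT syllable under both NFC and NFKC.
lv_t = "\uAC00\u11A8"
nfc_lv_t = unicodedata.normalize("NFC", lv_t)
nfkc_lv_t = unicodedata.normalize("NFKC", lv_t)
print(code_points(nfc_lv_t), code_points(nfkc_lv_t))
```

If the source string is first brought into NFD or NFKD, case 3 cannot arise, which is why the proposed wording names both forms.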