RE: Decomposition vs Full decomposition?

From: Peter Constable (petercon@microsoft.com)
Date: Tue Mar 15 2005 - 15:16:02 CST

    > From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]
    > On Behalf Of Deborah Goldsmith

    > A frequent, erroneous assumption I see... [is] that NFD will never
    > contain precomposed characters (that is, base character + diacritic(s)
    > in one character).

    NFD does, however, mean that no character with a canonical
    decomposition mapping remains. Deborah is right in saying there's a
    difference between these two things.

    Having no characters with a canonical decomposition mapping means,
    e.g., that you won't have precomposed A-acute (U+00C1), but NFD text
    can still contain things like the following, which have only
    compatibility decompositions:

    U+0132 LATIN CAPITAL LIGATURE IJ
    U+013F LATIN CAPITAL LETTER L WITH MIDDLE DOT
    U+1E9A LATIN SMALL LETTER A WITH RIGHT HALF RING
    U+0677 ARABIC LETTER U WITH HAMZA ABOVE

    as well as things that include a diacritic mark but have no
    decomposition at all, such as

    U+048A CYRILLIC CAPITAL LETTER SHORT I WITH TAIL
    U+0681 ARABIC LETTER HAH WITH HAMZA ABOVE
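
    The distinction above can be checked mechanically. What follows is
    only a minimal sketch, assuming Python's standard unicodedata module;
    the helper names survives_nfd and mapping_kind are made up for
    illustration, and U+00C1 is included for contrast. Results depend on
    the Unicode version bundled with the interpreter.

```python
import unicodedata

# Code points taken from the lists in this message.
COMPAT_ONLY = [0x0132, 0x013F, 0x1E9A, 0x0677]
NO_DECOMPOSITION = [0x048A, 0x0681]


def survives_nfd(cp):
    """Return True if the character is left unchanged by NFD."""
    ch = chr(cp)
    return unicodedata.normalize("NFD", ch) == ch


def mapping_kind(cp):
    """Classify the decomposition mapping of a code point.

    unicodedata.decomposition() returns an empty string when there is no
    mapping, and prefixes compatibility mappings with a tag such as
    "<compat>"; an untagged mapping is canonical.
    """
    mapping = unicodedata.decomposition(chr(cp))
    if not mapping:
        return "none"
    return "compatibility" if mapping.startswith("<") else "canonical"


if __name__ == "__main__":
    # U+00C1 (A-acute) is the contrast case: it has a canonical mapping.
    for cp in [0x00C1] + COMPAT_ONLY + NO_DECOMPOSITION:
        name = unicodedata.name(chr(cp))
        print(f"U+{cp:04X} {name}: mapping={mapping_kind(cp)}, "
              f"unchanged by NFD={survives_nfd(cp)}")
```

    With these inputs, only U+00C1 should be reported as changed by NFD;
    the other six should pass through untouched.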

    Peter Constable


