From: Peter Constable (petercon@microsoft.com)
Date: Tue Mar 15 2005 - 15:16:02 CST
> From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]
> On Behalf Of Deborah Goldsmith
> A frequent, erroneous assumption I see... [is] that NFD will never
> contain precomposed characters (that is, base character + diacritic(s)
> in one character).
NFD does, however, mean that the text contains no characters with a
canonical decomposition mapping. Deborah is right in saying there's a
difference between these two things.
Having no characters with a canonical decomposition mapping means, for
example, that you won't have precomposed A-acute (U+00C1), but NFD text
can still contain characters like the following, which have
compatibility decompositions:
U+0132 LATIN CAPITAL LIGATURE IJ
U+013F LATIN CAPITAL LETTER L WITH MIDDLE DOT
U+1E9A LATIN SMALL LETTER A WITH RIGHT HALF RING
U+0677 ARABIC LETTER U WITH HAMZA ABOVE
as well as characters that include a diacritic mark but have no
decomposition at all, such as
U+048A CYRILLIC CAPITAL LETTER SHORT I WITH TAIL
U+0681 ARABIC LETTER HAH WITH HAMZA ABOVE
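
The distinction above can be checked against the character database. What
follows is only a rough sketch, assuming Python's standard unicodedata
module (whose data reflects whichever Unicode version that Python build
ships with); the helper names survives_nfd and decomposition_kind are
made up for illustration and are not part of any library.

```python
import unicodedata


def survives_nfd(ch):
    """Return True if NFD leaves the single character ch unchanged."""
    return unicodedata.normalize("NFD", ch) == ch


def decomposition_kind(ch):
    """Classify the decomposition mapping field recorded for ch.

    Compatibility mappings carry a tag such as "<compat>" in front of
    the code points; canonical mappings have no tag.
    """
    mapping = unicodedata.decomposition(ch)
    if not mapping:
        return "none"
    return "compatibility" if mapping.startswith("<") else "canonical"


# A-acute for contrast, followed by the six characters listed above.
samples = ["\u00C1", "\u0132", "\u013F", "\u1E9A", "\u0677", "\u048A", "\u0681"]

for ch in samples:
    print(
        f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
        f"decomposition={decomposition_kind(ch)}, "
        f"survives NFD={survives_nfd(ch)}"
    )
```

Only A-acute should report a canonical mapping and be changed by NFD; the
other six should pass through NFD untouched, although the four with
compatibility mappings would be replaced under NFKD.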
Peter Constable