Re: Arabic Normalization chart

From: Maha Hassan (maha.hassan96@yahoo.com)
Date: Fri May 09 2008 - 18:41:52 CDT

    Thanks for the references. But why does U+06C7 have no decomposition? I can enter U+0648\U+0619 from an Arabic keyboard and get exactly the glyph of U+06C7. How come U+0623 has a decomposition and U+06C7 does not? What are the criteria?

    Thanks,
    Maha

    ----- Original Message ----
    From: Kenneth Whistler <kenw@sybase.com>
    To: maha.hassan96@yahoo.com
    Cc: unicode@unicode.org
    Sent: Friday, May 9, 2008 2:45:54 PM
    Subject: Re: Arabic Normalization chart

    > I am trying to understand the normalization chart for Arabic.
    > Why are certain glyphs not decomposed entirely under KD, for example:
    > \FBF0 ==> has KD = \064A\0654\06C7 instead of =\064A\0654\0648\0619
    > \FBDB ==> KD= \06c8 instead of  =\0648\0670
    > am I missing something?

    Yes. U+06C7 and U+06C8 have no decompositions.

    06C7;ARABIC LETTER U;Lo;0;AL;;;;;N;ARABIC LETTER WAW WITH DAMMAH;;;;
                                 ^^
    06C8;ARABIC LETTER YU;Lo;0;AL;;;;;N;ARABIC LETTER WAW WITH ALEF ABOVE;;;;
                                 ^^

    You cannot infer formal decompositions for letters -- particularly for Arabic -- simply by looking at the characters in the chart.

    To get the normative decomposition status of any particular character (which determines what its NFD or NFKD or NFC or NFKC normalizations will be), you have to look at the decomposition field in UnicodeData.txt (or check in NormalizationTest.txt).

    --Ken
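
The decomposition field Ken points to can also be inspected programmatically. The following is only a minimal sketch, assuming Python's standard `unicodedata` module, which exposes the decomposition field of whatever UnicodeData.txt version ships with the interpreter; the helper name `describe` is hypothetical and not part of the thread.

```python
import unicodedata


def describe(cp: int) -> str:
    """Summarize the decomposition field and NFKD form of one code point."""
    ch = chr(cp)
    # An empty string means the decomposition field in UnicodeData.txt is empty.
    decomp = unicodedata.decomposition(ch) or "(none)"
    nfkd = " ".join(f"{ord(c):04X}" for c in unicodedata.normalize("NFKD", ch))
    return f"U+{cp:04X} {unicodedata.name(ch)}: decomposition={decomp}; NFKD={nfkd}"


# The characters discussed in the thread.
for cp in (0x0623, 0x06C7, 0x06C8, 0xFBDB, 0xFBF0):
    print(describe(cp))
```

With this check, U+0623 reports a canonical decomposition while U+06C7 and U+06C8 report none, so NFKD of U+FBF0 stops at 064A 0654 06C7 rather than expanding the final letter any further.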



    This archive was generated by hypermail 2.1.5 : Fri May 09 2008 - 18:45:04 CDT