From: Theodore H. Smith (firstname.lastname@example.org)
Date: Tue Mar 15 2005 - 07:03:44 CST
I've been struggling to understand what the UCD.html file means, when
it talks about decompositions and full decompositions, and combining
It's quite wordy, and uses ambiguous terms (to the outsider). While I
appreciate that the terms you use are actually precise, to an outsider,
"full decomposition" and "decomposition" don't really bear much
So here are my questions.
1) Is "full decomposition" the same as "normalisation"?
2) Does normalisation differ from decomposition in the ordering of
combining chars? I.e., if I had a "double dots above the letter" (like
ü) and a "squiggly thing below the letter" (like Ç), both on the same
letter, then I suppose they should take on a certain ordering, correct?
One of the combiners should always come before the other one.
Is that what makes normalisation differ from decomposition?
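
To make question 2 concrete, here is a minimal sketch of the case I mean. It assumes Python's standard unicodedata module reflects the data in UnicodeData.txt; the variable names are mine and purely illustrative, and this is an experiment rather than a statement of what UCD.html intends.

```python
import unicodedata

# Illustrative names only (not taken from UCD.html).
DIAERESIS = "\u0308"  # "double dots above": COMBINING DIAERESIS
CEDILLA = "\u0327"    # "squiggly thing below": COMBINING CEDILLA

# Canonical combining classes, as recorded in UnicodeData.txt.
ccc_diaeresis = unicodedata.combining(DIAERESIS)
ccc_cedilla = unicodedata.combining(CEDILLA)

# The same two marks on one letter, entered in either order.
above_first = "u" + DIAERESIS + CEDILLA
below_first = "u" + CEDILLA + DIAERESIS

# NFD decomposes and then reorders adjacent marks by combining class,
# so both spellings should end up as the same code point sequence.
nfd_above_first = unicodedata.normalize("NFD", above_first)
nfd_below_first = unicodedata.normalize("NFD", below_first)

# A single decomposition mapping versus the fully decomposed form,
# using U+01D6 (u with diaeresis and macron) as the test character.
single_step = unicodedata.decomposition("\u01d6")
fully_decomposed = unicodedata.normalize("NFD", "\u01d6")

def code_points(text):
    """Render a string as space-separated hex code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)

print("combining classes:", ccc_diaeresis, ccc_cedilla)
print("NFD, dots first:    ", code_points(nfd_above_first))
print("NFD, squiggle first:", code_points(nfd_below_first))
print("one mapping step:   ", single_step)
print("fully decomposed:   ", code_points(fully_decomposed))
```

If I read the output correctly, the mark with the lower combining class is moved in front of the other one whichever way it was typed, and the single mapping for U+01D6 still contains a precomposed character that only the full decomposition breaks up further.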
3) I remember someone mentioning some special cases for normalisation,
that aren't included in UnicodeData.txt. But I don't remember what
these special cases were. Where do I read about them?
I do find the Unicode information quite heavy going and complex... but
that is the way with technical standards; XML wasn't much better. The
writers of technical standards rarely seem to have the talents that a
writer of books like "C in Plain English" has, or the O'Reilly in a
--
Theodore H. Smith - www.elfdata.com/plugin/ - www.elfdata.com/forum/
ElfData: Industrial strength string processing, made easy.
"All things are logical. Putting free-will in the slot for premises in a logical system, makes all of life both understandable, and free."