From: Antoine Leca (Antoine10646@leca-marti.org)
Date: Fri Mar 26 2004 - 15:06:38 EST
Philippe Verdy wrote:
> Space is a base character, then it combines with the next diacritic
> with which it creates a "default grapheme cluster" which should be
> interpreted as if it was a single character identity.
Agreed so far for diacritics. Agreed also for non-spacing dependent vowels
like U+0BC0. Agreed for the special exceptions like U+0BBE. I disagree for
U+093F or U+0BBF (Mc, not included in Other_Grapheme_Extend, so a break is
allowed before them), unless there is something I missed here.
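
As a rough illustration of the distinction drawn above, here is a minimal
Python sketch that prints the General_Category of the code points under
discussion. It relies only on the standard unicodedata module; the helper
names (describe, general_categories) are made up for this example. Note that
unicodedata does not expose Other_Grapheme_Extend, so the special status of
U+0BBE cannot be read from this output and has to be checked against
PropList.txt.

```python
import unicodedata

# Code points mentioned in this message (the selection is illustrative only).
CODE_POINTS = [0x0300, 0x0BC0, 0x0BBE, 0x093F, 0x0BBF]


def describe(cp):
    """Return (U+XXXX label, General_Category, character name) for a code point."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.category(ch), unicodedata.name(ch))


def general_categories(cps=CODE_POINTS):
    """Map each U+XXXX label to its General_Category (Mn, Mc, ...)."""
    return {f"U+{cp:04X}": unicodedata.category(chr(cp)) for cp in cps}


if __name__ == "__main__":
    for cp in CODE_POINTS:
        print(*describe(cp))
```

U+0300 and U+0BC0 come out as Mn (non-spacing), while U+0BBE, U+093F and
U+0BBF all come out as Mc (spacing combining). The General_Category alone
therefore does not separate U+0BBE from the other two.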
> It is NOT defective.
I do not understand. I did not say anything implying that, did I? I just
remarked that I was not able to find in the text of the standard any wording
that gives a solid basis for requiring vendors and implementers (like me) to
modify their engines to provide special exceptions for the combination
U+0020/U+00A0 followed by U+093F.
And no, this is not the same as displaying a diacritic, because it should be
re-ordered, rather than being a "spacing representation of diacritics".
> Now how would you interpret differently SPACE+diacritic or
> SPACE+vowel sign?
> If you display a dotted circle there, then you'll
> display two separate glyphs for a single grapheme cluster, and this
> is not intended by the normal Unicode character model.
How do you believe anybody will display, say, U+0063 U+0300? Which font has
this as a single glyph?
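
To make the point concrete at the character level (this says nothing about
what a particular font contains), here is a small Python sketch showing that
U+0063 U+0300 has no precomposed form, so it stays as two code points under
NFC, whereas U+0065 U+0300 composes to U+00E8. The helper name
nfc_code_points is made up for this example.

```python
import unicodedata


def nfc_code_points(text):
    """Return the U+XXXX labels of the code points of text after NFC normalization."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


if __name__ == "__main__":
    print(nfc_code_points("c\u0300"))  # c + combining grave: stays as two code points
    print(nfc_code_points("e\u0300"))  # e + combining grave: composes to one code point
```

A renderer faced with the first sequence will typically have to place two
glyphs, since there is no single character for a font to map.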
Furthermore, a single character like U+0916 (Devanagari KHA) is very often
rendered with two glyphs (namely, Half-Kha then the glyph also used for the
AA-matra, U+093E). Unicode does not enter into knowing how does this stuff