Major Defect in Combining Classes of Tibetan Vowels

From: Christopher John Fynn (cfynn@gmx.net)
Date: Sat Jun 21 2003 - 21:23:17 EDT

    In Unicode's UnicodeData.txt (
     http://www.unicode.org/Public/UNIDATA/UnicodeData.txt )
     0F7E has a Canonical Combining Class Value (CCCV) of 0;
     0F71 a CCCV of 129;
     0F72 0F7A 0F7B 0F7C 0F7D and 0F80 a CCCV of 130;
     0F74 a CCCV of 132;
     and 0F82 and 0F83 have a CCCV of 230.
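
     These values can be checked without reading UnicodeData.txt by
     hand. A minimal sketch, assuming Python and its standard
     unicodedata module (which reports whatever Unicode version that
     Python build bundles; the helper name combining_classes is only
     illustrative):

```python
import unicodedata

# Code points discussed in this message, in the order they are listed above.
TIBETAN_MARKS = [0x0F7E, 0x0F71, 0x0F72, 0x0F7A, 0x0F7B, 0x0F7C, 0x0F7D,
                 0x0F80, 0x0F74, 0x0F82, 0x0F83]


def combining_classes(code_points):
    """Map each code point to its Canonical Combining Class value."""
    return {cp: unicodedata.combining(chr(cp)) for cp in code_points}


ccc = combining_classes(TIBETAN_MARKS)
for cp, value in ccc.items():
    print(f"{cp:04X} ccc={value}")
```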

     By normal Tibetan & Dzongkha spelling, writing, and input rules,
     Tibetan script stacks should be entered and written in this
     order: one headline consonant (0F40-0F6A), any subjoined
     consonant(s) (0F90-0FBC), achung (0F71), shabkyu (0F74), any
     above-headline vowel(s) (0F72 0F7A 0F7B 0F7C 0F7D and 0F80),
     and any ngaro (0F7E, 0F82 and 0F83).

     So, following normal Tibetan & Dzongkha input and spelling
     rules, the relative ordering of these characters should be:
     A. 0F71
     B. 0F74
     C. 0F72 0F7A 0F7B 0F7C 0F7D and 0F80
     D. 0F7E, 0F82 and 0F83
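
     The ordering A to D above can be written down as a rank table.
     The following is only a sketch in Python of re-sorting a run of
     these marks into that order; the names PREFERRED_RANK and
     preferred_order are hypothetical, and this is not a
     standardized workaround:

```python
# Rank of each mark in the spelling order given above (A=0 ... D=3).
PREFERRED_RANK = {
    0x0F71: 0,                                      # A. achung
    0x0F74: 1,                                      # B. shabkyu
    **{cp: 2 for cp in (0x0F72, 0x0F7A, 0x0F7B,
                        0x0F7C, 0x0F7D, 0x0F80)},   # C. above-headline vowels
    **{cp: 3 for cp in (0x0F7E, 0x0F82, 0x0F83)},   # D. ngaro
}


def preferred_order(marks):
    """Stable-sort a string of the marks listed above into A-D order."""
    return "".join(sorted(marks, key=lambda ch: PREFERRED_RANK[ord(ch)]))


# Gigu followed by shabkyu is put back into shabkyu, gigu order.
print([f"{ord(ch):04X}" for ch in preferred_order("\u0F72\u0F74")])
```

     Because the sort is stable, marks of equal rank keep the order
     in which they were given.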

     In a process of "canonical decomposition" or "normalisation",
     these combining characters can get reordered into a bizarre
     order relative to each other. This causes difficulties with
     culturally correct collation (where 0F7E, 0F82 and 0F83 should
     have an equal value) - and especially it necessitates making
     lookups in smart fonts far more complex and inefficient than
     they should have to be.
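
     The reordering is easy to reproduce. A small sketch, again
     assuming Python's standard unicodedata module (the helper name
     code_points is only illustrative): the consonant 0F40 followed
     by shabkyu (0F74) and then an above-headline vowel (0F72) is in
     the normal spelling order, but normalisation swaps the two
     vowels because 130 sorts before 132.

```python
import unicodedata


def code_points(text):
    """Render a string as space-separated hexadecimal code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


# 0F40 + shabkyu (ccc 132) + gigu (ccc 130), in normal spelling order.
typed = "\u0F40\u0F74\u0F72"
normalized = unicodedata.normalize("NFD", typed)

print(code_points(typed))       # 0F40 0F74 0F72
print(code_points(normalized))  # 0F40 0F72 0F74

# 0F7E has ccc 0, so it is never moved and blocks reordering across it.
blocked = unicodedata.normalize("NFD", "\u0F40\u0F7E\u0F74")
print(code_points(blocked))     # 0F40 0F7E 0F74
```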

     (In Tibetan script fonts, 0F71 and 0F74 are often ligated with
     the preceding consonant, plus any subjoined consonants, as a
     single glyph, whereas above-headline vowels are almost always
     treated as non-spacing combining marks.)

     Currently there seems to be no easy or standardized workaround
     for these problems, and the standard seems to say that the
     relative values of assigned Canonical Combining Class Values
     cannot be changed.

     Any suggestions as to how to create a standardized workaround
     for these incorrect values?

     - Chris


