RE: Never say never

From: Andy White
Date: Tue Feb 11 2003 - 21:05:33 EST

    Thank you for the reply.

    Given this information, I wonder if anyone can tell me why the 'Bengali
    letter AE' and 'Bengali Letter EA' were never included in the UCS?
    (I am talking about the letters mentioned in the Unicode Indic FAQ.)

    > -----Original Message-----
    > From: Kenneth Whistler
    > Sent: 12 February 2003 01:38
    > Subject: RE: Never say never
    >
    > Andy White wrote:
    > > And I today see that the precomposed character '0B71 ORIYA LETTER WA'
    > > has been added to the UCS4.0 charts
    > >
    > > This is clearly a composition of ORIYA LETTER O and ORIYA LETTER
    > > VA (BA).
    >
    > People on the list today are playing a little fast and loose
    > with the terminology of "precomposed" and "composition".
    > In the Unicode Standard, a character is not precomposed or
    > composite unless it has a formal decomposition mapping
    > defined in the Unicode Character Database (namely in UnicodeData.txt).
    > While ORIYA LETTER WA is graphically constructed of the
    > form for the ORIYA LETTER O and the bottom half of PA (not
    > BA), it doesn't fit the pattern one would expect for
    > consonant conjuncts (C+C, not V+C), and it isn't given a
    > formal decomposition in UnicodeData.txt, because even though
    > it is graphically complex, it otherwise fits into the pattern
    > of the regular consonant letters for Indic scripts (as an
    > alternate for VA). Note that the new ORIYA LETTER VA is also
    > graphically complex -- a dotted BA -- but is also not given a
    > decomposition.
    >
    > For that matter, you could look to existing Oriya characters
    > such as U+0B06 ORIYA LETTER AA and claim it is just a graphic
    > combination of U+0B05 ORIYA LETTER A and U+0B3E ORIYA VOWEL
    > SIGN AA. But such decompositions are *also* not used in the
    > standard. So ORIYA LETTER AA is an *atomic* character in
    > Unicode, despite the fact that it is graphically complex (and
    > analyzable into parts).
    >
    > If anyone wants a pointless exercise in simplification for
    > the benefit of complexity sometime, try working on the
    > Yi syllabary charts (U+A000..U+A48C) and pull these
    > graphically complex forms apart into all of their
    > duplicated constituent parts. The mere fact that such
    > forms are graphically complex and have identifiable parts
    > is not what establishes, however, their status as atomic
    > versus composite characters in the Unicode Standard.
    >
    > --Ken
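
    The test Ken describes (a character is "precomposed" only if the
    Unicode Character Database gives it a formal decomposition mapping)
    can be checked mechanically. What follows is a minimal sketch in
    Python, assuming the standard unicodedata module, which exposes the
    decomposition field of UnicodeData.txt for whatever Unicode version
    ships with the interpreter (so not necessarily the 2003 data). The
    helper name is_atomic is hypothetical, chosen only for this
    illustration. U+0B4B ORIYA VOWEL SIGN O is included as a contrasting
    case that does carry a decomposition.

```python
import unicodedata


def is_atomic(ch: str) -> bool:
    """Return True if ch has no formal decomposition mapping in the UCD.

    Hypothetical helper: "atomic" here means only that the decomposition
    field for the character is empty, whatever the glyph looks like.
    """
    return unicodedata.decomposition(ch) == ""


# Code points discussed in the thread, plus one contrasting case.
samples = [
    "\u0B06",  # ORIYA LETTER AA: graphically A + vowel sign AA, no mapping
    "\u0B71",  # ORIYA LETTER WA: graphically complex, no mapping
    "\u0B35",  # ORIYA LETTER VA: a dotted BA, no mapping
    "\u0B4B",  # ORIYA VOWEL SIGN O: has a canonical decomposition
]

for ch in samples:
    name = unicodedata.name(ch, "<unnamed in this UCD version>")
    mapping = unicodedata.decomposition(ch) or "(none)"
    status = "atomic" if is_atomic(ch) else "composite"
    print(f"U+{ord(ch):04X} {name}: {status}, decomposition = {mapping}")
```

    The check looks only at the data file and never at the shape of the
    glyph, which is the distinction being drawn above.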
