RE: Reph and Khmer encoding model

From: Mijan (mijan@bangladesh.net)
Date: Wed Mar 05 2003 - 11:26:05 EST


    Quoting Kent Karlsson <kentk@md.chalmers.se>:

    >
    > > I understand that unicode is supposed to represent the
    > > language, not the way it is written.
    >
    > No, Unicode is supposed to be able to represent the written
    > form. (Of course.)

    Yes, I was wrong! I think I wanted to say something like, "Unicode is supposed
    to be able to represent the written language with logically equivalent code
    points".
    (Because the argument is about what is logically equivalent to ya-phalaa.)

    Mijan

    > ...
    > > Let's consider the ra+virama+ya case. For the most part,
    > > ra+virama+ya is displayed as ya+reph. This is clearly an
    > > instance of ambiguous interpretation, because ra+virama+ya
    > > could also represent ra+ja-phalaa. ya+reph and ra+ja-phalaa
    > > are used in different words and have different meanings.
    > > From this you can see that ja-phalaa is not equivalent to
    > > virama-ya and is better encoded as a separate letter in
    > > Unicode. We have always thought of ya-phalaa as a separate
    > > letter anyway.
    >
    >
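To make the ambiguity described above concrete, here is a minimal Python sketch. It assumes the script under discussion is Bengali, so that ra, virama and ya are U+09B0, U+09CD and U+09AF; the helper name `describe` is made up for illustration. It only lists what is stored in the text and says nothing about how a given font or renderer will draw it.

```python
import unicodedata

# Assumed Bengali code points for the sequence discussed above.
RA, VIRAMA, YA = "\u09B0", "\u09CD", "\u09AF"


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# The stored sequence is the same three code points, whichever of the two
# written forms (ya+reph or ra+ya-phalaa) the writer had in mind.
sequence = RA + VIRAMA + YA

for code_point, name in describe(sequence):
    print(code_point, name)
```

The point of the sketch is that nothing in these three code points records which of the two written forms was intended, which is the ambiguity being argued about.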
    > > > >3. There are no other cases of a Vowel+Virama combination in the
    > > > >Unicode encoding model.
    > > >
    > > > Yes, there are. Khmer.
    > >
    > > I do not understand Khmer, but I see that it does not use the
    > > same 'encoding model'. Please look; you will see that you were
    > > wrong to use Khmer as an example.
    >
    > Khmer uses the same encoding model as most other Indic scripts,
    > except for one point: the "reph" is represented via a combining
    > character (which also means that it does not come in "logical order"
    > in the text representation), so the ambiguity you refer to does
    > not exist for Khmer. Further, Khmer could have been represented
    > in a "Tibetan-like" encoding model (but isn't). Further, IIRC,
    > independent vowels can both be subscripted (before virama/coeng)
    > and be subscripts (after virama/coeng) in Khmer. The latter is
    > orthographically different from using dependent vowels.
    >
    > /kent k
    >
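As a rough illustration of the Khmer point above, the following Python sketch looks up two Khmer characters in the Unicode character database. It assumes that the "reph" combining character meant here is U+17CC KHMER SIGN ROBAT and that "coeng" is U+17D2 KHMER SIGN COENG; the helper name `summarize` is hypothetical.

```python
import unicodedata

# Assumption: the Khmer "reph" referred to above is KHMER SIGN ROBAT.
ROBAT = "\u17CC"
# KHMER SIGN COENG, the character called virama/coeng above.
COENG = "\u17D2"


def summarize(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


for ch in (ROBAT, COENG):
    print(*summarize(ch))
```

Both characters are reported with general category Mn (nonspacing mark). Under the assumption above, the robat is a mark of its own that attaches to a base consonant, so it is not spelled as a consonant+coeng sequence and the ra+virama+ya style of ambiguity does not arise.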




    This archive was generated by hypermail 2.1.5 : Wed Mar 05 2003 - 12:09:04 EST