Re: Telugu Unicode Encoding Review

From: Doug Ewell (doug@ewellic.org)
Date: Sat Oct 23 2010 - 15:17:37 CDT


    Kiran Kumar Chava wrote:

    > I just couldn't understand the misery in saving two code points by
    > forcing font developers to use devanagari code points for Telugu
    > script.

    Actually it's not a matter of saving code points. The corresponding
    code points in other Indic blocks are reserved and very unlikely to be
    used for any other characters.

    It's more a matter of unifying code points that are displayed the same
    and have the same function, which is a core principle of Unicode.
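    As a rough illustration of the two points above, the sketch below uses
    Python's standard unicodedata module to show that the danda and double
    danda live in the Devanagari block, and that the corresponding slots in
    the other Indic blocks have no assigned character. The block start
    values and the helper name reserved_danda_slots are assumptions made
    for this example, not anything taken from the Standard's own tooling.

```python
import unicodedata

# The unified punctuation marks, encoded once in the Devanagari block.
DANDA = "\u0964"
DOUBLE_DANDA = "\u0965"

# Start of each 128-code-point Indic block (assumed here for illustration).
# The danda slots sit at offsets 0x64 and 0x65 from the block start,
# mirroring their position in the Devanagari block (U+0900 + 0x64).
INDIC_BLOCK_STARTS = {
    "Bengali": 0x0980,
    "Gurmukhi": 0x0A00,
    "Gujarati": 0x0A80,
    "Oriya": 0x0B00,
    "Tamil": 0x0B80,
    "Telugu": 0x0C00,
    "Kannada": 0x0C80,
    "Malayalam": 0x0D00,
}


def reserved_danda_slots():
    """Map each script to the names found at its danda-equivalent slots.

    A value of None means the code point has no assigned character.
    """
    result = {}
    for script, start in INDIC_BLOCK_STARTS.items():
        slots = (start + 0x64, start + 0x65)
        result[script] = [unicodedata.name(chr(cp), None) for cp in slots]
    return result


if __name__ == "__main__":
    print(f"U+{ord(DANDA):04X}", unicodedata.name(DANDA))
    print(f"U+{ord(DOUBLE_DANDA):04X}", unicodedata.name(DOUBLE_DANDA))
    for script, names in reserved_danda_slots().items():
        print(script, names)
```

    Telugu text therefore uses U+0964 and U+0965 directly, and a Telugu
    font needs glyphs mapped to those two code points.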

    > The following are problems I am seeing
    > 1. This double danda in Telugu script is used for other purposes, like
    > abbreviations, pre and post Pallavi of a song, (see page 32 of
    > http://te.chavakiran.com/blog/?p=774 this pdf. ) ,, || 2|| to say this
    > line is to be sung twice (page 46 of
    > http://te.chavakiran.com/blog/?p=774 pdf)

    Where is the problem? U+002E FULL STOP (in the Basic Latin block) is
    used for many unrelated purposes, in many scripts besides Latin. This
    isn't a serious problem either.

    > 2. What are we going to do for sorting?

    Use the Unicode Collation Algorithm. Don't expect code point order to
    be usable for sorting, EVER. This is the very first item on the
    Collation FAQ page.
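    A small sketch of why code point order fails, using only the Python
    standard library. The crude_key function is a hypothetical stand-in
    invented for this example; it is NOT the Unicode Collation Algorithm,
    only a simplified key (strip combining marks, fold case) that shows
    the kind of ordering a real collator provides. Real applications
    would use an actual UCA implementation, such as the one in ICU.

```python
import unicodedata

words = ["zebra", "Apple", "éclair", "apple", "Zoo"]

# Plain sorted() compares code points: all uppercase ASCII letters come
# before all lowercase ones, and accented letters land after "z".
by_code_point = sorted(words)


def crude_key(s):
    """Hypothetical, simplified sort key. This is not the UCA."""
    # Decompose so accents become separate combining marks, then drop them.
    decomposed = unicodedata.normalize("NFD", s)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Compare case-insensitively first; use the original string as tie-break.
    return (base.casefold(), s)


by_crude_key = sorted(words, key=crude_key)

if __name__ == "__main__":
    print(by_code_point)
    print(by_crude_key)
```

    The first list puts "Zoo" ahead of "apple" and "éclair" at the very
    end, which no reader would expect from a dictionary.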

    > 3. Telugu font developers may not even know they need to insert these
    > in devangari block.

    They should probably learn this as part of their craft.

    Keep in mind that none of the issues above are unique to Telugu. Users
    of Bengali and Gurmukhi and Gujarati and Oriya and Tamil and Kannada and
    Malayalam have the exact same considerations.

    > Is there a link to previous discussion on this unification && dis
    > unification of danda signs?

    Very possibly not, since the decision goes back 20 years or more.

    --
    Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
    RFC 5645, 4645, UTN #14 | ietf-languages @ is dot gd slash 2kf0s
    

