some thoughts on encoding Indic fractions etc.

From: N. Ganesan (naa.ganesan@gmail.com)
Date: Wed May 10 2006 - 06:06:40 CDT

    Reading the Unicode 5.0 charts:
    http://www.unicode.org/Public/5.0.0/charts/CodeCharts-5.0.0d1.pdf

    http://std.dkuug.dk/JTC1/SC2/WG2/docs/N3059.pdf

    It is great that the Balinese script, a Pallava descendant, is getting
    encoded.

    (a)
    The Malayalam zero glyph is changing for good. Why is it not shown within
    a yellow block? Is it because it's just a glyph change?

    (b)
    It is very good to see the Saurashtra script getting encoded.
    It was originally used by a community of silk sari weavers
    centered around Madurai, Tamil Nadu, India. In the Saurashtra
    script we can see the influence of the principles of both the
    Nagari and Tamil scripts. I agree with Prof. Peri Bhaskararao's
    comments on the need to position Saurashtra letters in the code chart
    in spots parallel to those of major Indic scripts such as Devanagari,
    Kannada, Tamil, Bangla, ....:
    http://std.dkuug.dk/JTC1/sc2/WG2/docs/n2620.pdf
    After all, there are only very few actual users of this script,
    and even Tamil, whose collation order gave birth to the Dravidian
    Etymological Dictionary (DED) order, is given a Sanskrit-type
    model in Unicode. Maybe the Saurashtra script is at too advanced
    a stage of encoding, and a change now is just too difficult. But I hope
    that, at least in the future, the UTC considers Brahmic harmonization
    of such scripts with the model in place in Unicode for the major
    Indic scripts.
    Even the Sinhala and Burmese scripts, which are used outside India
    proper, have been mentioned for Brahmic harmonization:
    http://www.egt.ie/standards/si/si.html
    http://www.evertype.com/standards/my/my.html
    At least within modern India itself, Brahmic harmonization in the
    future will be useful, because all the major scripts, with potential
    user bases running to 100+ million, are encoded in Unicode that way.

    (c)
    Also, I saw 3 Malayalam fractions encoded. Even though they are
    not used, fractions can be encoded for south Indian scripts.
    The problem is that there are many more fraction symbols than
    just the 1/4, 1/2 and 3/4 getting encoded in Malayalam.
    Telugu and Kannada will have many fraction symbols too.
    Tamil has some 25 fractions, and a host of other symbols.
    Malayalam has a bunch of fractions as well, not just these 3.
    If Unicode encodes just the 3 fractions for Malayalam now,
    the rest of the fraction symbols, when added later, will not be
    in ascending or descending order at all; there will be gaps.
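
    To make the ordering concern concrete, here is a minimal sketch in
    Python. Everything in it is an assumption for illustration only: the
    code points are made-up Private Use Area positions, not the ones in
    the charts, and the helper names is_monotonic_by_value and
    skipped_positions are hypothetical.

```python
from fractions import Fraction


def is_monotonic_by_value(assignments):
    """True if the numeric values strictly ascend or strictly descend
    when the symbols are listed in code point order."""
    values = [value for _, value in sorted(assignments.items())]
    pairs = list(zip(values, values[1:]))
    return all(a < b for a, b in pairs) or all(a > b for a, b in pairs)


def skipped_positions(assignments):
    """Code points between the lowest and highest fraction symbol
    that do not hold a fraction symbol."""
    used = set(assignments)
    return [cp for cp in range(min(used), max(used) + 1) if cp not in used]


# Hypothetical first round: only three fractions, at made-up positions.
initial = {
    0xE003: Fraction(1, 4),
    0xE004: Fraction(1, 2),
    0xE005: Fraction(3, 4),
}

# Hypothetical later round: smaller fractions have to go wherever room
# is left, so they end up after the larger ones, with a gap in between.
later = dict(initial)
later[0xE008] = Fraction(1, 16)
later[0xE009] = Fraction(1, 8)

# Hypothetical dedicated symbols page: the same five fractions allocated
# together, contiguously and in ascending order of value.
dedicated = {0xE100 + i: v for i, v in enumerate(sorted(later.values()))}

for name, layout in [("initial", initial), ("later", later), ("dedicated", dedicated)]:
    print(name, is_monotonic_by_value(layout), [hex(cp) for cp in skipped_positions(layout)])
```

    The "later" layout is the situation described above: the values are
    no longer ordered by code point and the run of fractions is broken,
    while the "dedicated" layout keeps both properties.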

    So, a better solution would be to move the South Indian scripts'
    fractions and symbols to a separate "South Indian Symbols"
    page, just as the Khmer symbols code page is separate from the
    Khmer akshara-letters code page:
    http://www.unicode.org/charts/PDF/U19E0.pdf

    If a separate "South Indian Symbols" page is allocated,
    we can encode the Malayalam, Tamil, Kannada, Telugu, Oriya, ....
    fractions there contiguously, in ascending or descending order.
    We can supply more Malayalam fractions, for example.

    Comments are appreciated.

    N. Ganesan


