Re: Unicode and transliteration

Date: Wed Aug 25 1999 - 20:23:52 EDT on 08/23/99 11:39:50 PM

Sent by:

To: Unicode List <>
cc: (Cibu Johny/MW/US/3Com)
Subject: Re: Unicode and transliteration wrote:
> As I understand Unicode, it is trying to represent a text in its deep structure,
> and it is the job of the font to convert that deep structure to surface structure,
> or actual glyphs of the text. This is exactly what transliteration is also trying
> to do (at least in the case of Indic scripts). Finding the rules to do this
> conversion is the core of both. What remains is assigning code
> numbers in the case of Unicode, or assigning corresponding Latin character sequences
> in the case of transliteration. Both are reasonably trivial. So my questions are:
> 1. Is my theory correct? If not, in which way?
> 2. Are these rules for conversion from deep structure to surface structure
> documented somewhere, in the case of Malayalam?

However you look at it, this is beyond what the Unicode Standard provides. If you
want character encodings, Unicode provides them; for visual elements and the
behavior of your cursor keys, CDAC's implementation in Leap is probably
the de facto standard you should follow. I believe this is well documented
in their manuals.

====> Thanks for your kind reply.

    My question was not based on any implementations or other standards.

    Given an encoding for a language, a Unicode font should be able to reproduce
    all the glyphs existing in that language. As you all know, this is not that
    intuitive. For example, Malayalam 'ra' (0D31) can take at least three glyph
    forms depending on the context. Whether all these glyphs are reproducible
    from a Unicode font should be a concern. For this, rules should be defined
    stating that such and such a combination produces such and such a glyph. As
    an example, <ra> + <zwj> produces the 'cillu' glyph of 'ra'. Does such a
    rule set exist?
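
    The kind of rule set asked about here could be sketched roughly as below.
    This is only an illustration under stated assumptions, not anything defined
    by the Unicode Standard or taken from a real font: the glyph names
    (ra.cillu, ra.base, the uniXXXX fallback), the RULES table and the shape()
    function are all hypothetical. The only rule taken from the message is
    <ra> + <zwj> giving the 'cillu' form; the default form is an assumption.

```python
ZWJ = "\u200D"  # ZERO WIDTH JOINER
RA = "\u0D31"   # the code point the message refers to as 'ra'

# Hypothetical rule set: code point sequence -> glyph name.
RULES = {
    (RA, ZWJ): "ra.cillu",  # the one rule stated in the message
    (RA,): "ra.base",       # assumed default form, for illustration only
}


def shape(text, rules=RULES):
    """Map a string to glyph names by greedy longest-match over the rules."""
    max_len = max(len(key) for key in rules)
    glyphs = []
    i = 0
    while i < len(text):
        # Try the longest candidate sequence first, then shorter ones.
        for n in range(min(max_len, len(text) - i), 0, -1):
            key = tuple(text[i:i + n])
            if key in rules:
                glyphs.append(rules[key])
                i += n
                break
        else:
            # No rule matched: fall back to a placeholder name for the code point.
            glyphs.append(f"uni{ord(text[i]):04X}")
            i += 1
    return glyphs


print(shape(RA + ZWJ))       # ['ra.cillu']
print(shape(RA + RA + ZWJ))  # ['ra.base', 'ra.cillu']
```

    A table like this only answers the question once every contextual form
    has an entry, so the open point remains whether such a complete list is
    documented anywhere for Malayalam.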

