Re: 28th IUC paper - Tamil Unicode New

From: John H. Jenkins (jenkins@apple.com)
Date: Tue Aug 23 2005 - 17:13:03 CDT

    It's not at all likely to be approved. Unicode already has a Tamil
    encoding model which -- however flawed it may be in its theoretical
    grounding -- works and is supported on major platforms.

    As Ken notes, adding a second full Tamil encoding model to the
    standard would create a number of interoperability problems. Unless
    it can be shown to have significant *technical* advantages (meaning
    something other than making it possible to render Tamil without an
    advanced shaping engine), it's not likely to succeed.

    On Aug 23, 2005, at 1:47 PM, Sinnathurai Srivas wrote:

    > I think the PUA is being used on an experimental basis, and the
    > characters will be moved to regular code points once approved.
    >
    > I would like to hear technical views other than the temporary use
    > of the PUA and the discussion of whether it will be officially
    > encoded or not, which can be taken up at a later date. This will
    > help avoid the stability trap that all other languages find
    > themselves in.
    >
    > Sinnathurai
    >
    >
    > ----- Original Message ----- From: "John Hudson" <tiro@tiro.com>
    > Cc: "Unicode List" <unicode@unicode.org>
    > Sent: Monday, August 22, 2005 10:35 PM
    > Subject: Re: 28th IUC paper - Tamil Unicode New
    >
    >
    >
    >> Richard Wordingham wrote:
    >>
    >>
    >>>> GSUB tables don't handle the reordering in Indic languages. It's
    >>>> the responsibility of the OpenType Layout processor, e.g.
    >>>> Uniscribe.
    >>>>
    >>
    >>
    >>> So how do I get it to live up to its 'responsibility' to support
    >>> an Indic conlang living in the PUA? I'm not even sure that
    >>> Burmese is supported yet.
    >>>
    >> ...
    >>
    >>
    >>> While it clearly looks like good practice to have a single per-
    >>> script definition of necessary re-orderings, in practice it is
    >>> very inconvenient if the user (or system administrator) cannot
    >>> update the definitions. For example, Microsoft has little
    >>> incentive to modify Uniscribe to treat independent Devanagari
    >>> vowels as consonants (or, to be pedantic, consonant-vowel
    >>> ligatures).
    >>>
    >>
    >> If you want something supported, you have to take it through the
    >> standards process and get it approved as part of Unicode or
    >> another standard that the software company in question is
    >> committed to supporting. If the behaviour you want to see for
    >> Devanagari becomes part of Unicode's processing requirements for
    >> that script, then you can expect Microsoft to support it.
    >>
    >>
    >> A shaping engine has no 'responsibility' to support an Indic
    >> conlang living in the PUA, because the shaping engine has no way
    >> of knowing that a string of PUA codepoints is text in an Indic
    >> conlang.
    >>
    >> The very nature of the PUA effectively makes it a dead end for
    >> most language processing, unless you have a very simple script in
    >> which there is a one-to-one correspondence between characters and
    >> glyphs and simple sequential, left-to-right display. Shaping
    >> engines simply don't know what to do when you pass them a PUA
    >> codepoint, because it could be *anything*. This is why using non-
    >> standard, PUA codepoints for any language processing is such a bad
    >> idea.
    >>
    >> John Hudson
    >>
    >> --
    >>
    >> Tiro Typeworks www.tiro.com
    >> Vancouver, BC tiro@tiro.com
    >>
    >> Currently reading:
    >> Lords of the horizons, by Jason Goodwin
    >> Dining on stone, by Iain Sinclair
    >>
    >>
    >
    >
    >
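
    The point quoted above, that a shaping engine cannot know what a
    PUA codepoint is, can be checked against the Unicode Character
    Database. The sketch below is only an illustration, assuming
    Python's standard unicodedata module; the helper name describe()
    is hypothetical and is not part of any shaping engine. It compares
    two encoded Tamil characters with the first BMP PUA codepoint.

```python
import unicodedata

def describe(ch):
    """Return (general category, character name or None) for one code point."""
    # unicodedata.name() returns the supplied default for unnamed code points.
    return unicodedata.category(ch), unicodedata.name(ch, None)

tamil_ka = "\u0B95"      # TAMIL LETTER KA
tamil_sign_e = "\u0BC6"  # TAMIL VOWEL SIGN E, displayed to the left of its consonant
pua = "\uE000"           # first code point of the BMP Private Use Area

for ch in (tamil_ka, tamil_sign_e, pua):
    print(f"U+{ord(ch):04X}", describe(ch))
```

    The Tamil characters report a name and a category (letter,
    spacing mark) that script-aware software can build on. The PUA
    codepoint reports only category Co (private use) and no name, so
    there is nothing standard to tell a shaping engine that it needs
    reordering.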

    ========
    John H. Jenkins
    jenkins@apple.com
    jhjenkins@mac.com
    http://homepage.mac.com/jhjenkins/


