Re: 28th IUC paper - Tamil Unicode New

From: Sinnathurai Srivas
Date: Tue Aug 23 2005 - 14:47:49 CDT

    I think the PUA is being used on an experimental basis, and the characters
    will be moved to standard code points once approved.

    I would like to hear technical views on matters other than the temporary
    use of the PUA and whether the proposal will be officially encoded or not;
    those can be discussed at a later date. This will help avoid the stability
    trap that all other languages find themselves in.


    ----- Original Message -----
    From: "John Hudson"
    Cc: "Unicode List"
    Sent: Monday, August 22, 2005 10:35 PM
    Subject: Re: 28th IUC paper - Tamil Unicode New

    > Richard Wordingham wrote:
    >>> GSUB tables don't handle the reordering in Indic languages. It's the
    >>> responsibility of the OpenType Layout processor, e.g. Uniscribe.
    >> So how do I get it to live up to its 'responsibility' to support an Indic
    >> conlang living in the PUA? I'm not even sure that Burmese is supported
    >> yet.
    > ...
    >> While it clearly looks like good practice to have a single per-script
    >> definition of necessary re-orderings, in practice it is very inconvenient
    >> if the user (or system administrator) cannot update the definitions. For
    >> example, Microsoft has little incentive to modify Uniscribe to treat
    >> independent Devanagari vowels as consonants (or, to be pedantic,
    >> consonant-vowel ligatures).
    > If you want something supported, you have to take it through the standards
    > process and get it approved as part of Unicode or another standard that
    > the software company in question is committed to supporting. If the
    > behaviour you want to see for Devanagari becomes part of Unicode's
    > processing requirements for that script, then you can expect Microsoft to
    > support it.
    > A shaping engine has no 'responsibility' to support an Indic conlang
    > living in the PUA, because the shaping engine has no way of knowing that a
    > string of PUA codepoints is text in an Indic conlang.
    > The very nature of the PUA effectively makes it a dead end for most
    > language processing, unless you have a very simple script in which there
    > is a one-to-one correspondence between characters and glyphs and simple
    > sequential, left-to-right display. Shaping engines simply don't know what
    > to do when you pass them a PUA codepoint, because it could be *anything*.
    > This is why using non-standard, PUA codepoints for any language processing
    > is such a bad idea.
    > John Hudson
    > --
    > Tiro Typeworks
    > Vancouver, BC
    > Currently reading:
    > Lords of the horizons, by Jason Goodwin
    > Dining on stone, by Iain Sinclair
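
    The point in the quoted message, that a shaping engine cannot know what a
    PUA codepoint is, can be illustrated with a minimal Python sketch. It uses
    the standard unicodedata module to compare the properties of an encoded
    Tamil letter with those of a PUA codepoint. The toy_reorder function and
    the PREBASE_VOWEL_SIGNS table are hypothetical names made up for this
    illustration; they are not how Uniscribe or any real OpenType Layout
    processor is implemented, and they ignore two-part vowel signs and every
    other part of real Tamil shaping.

```python
import unicodedata

# Assumption for illustration: a tiny per-script table of Tamil vowel signs
# that are displayed to the left of the consonant they follow in logical order.
PREBASE_VOWEL_SIGNS = {"\u0BC6", "\u0BC7", "\u0BC8"}  # VOWEL SIGN E, EE, AI


def is_private_use(ch: str) -> bool:
    """True if the character has general category Co (private use)."""
    return unicodedata.category(ch) == "Co"


def describe(ch: str) -> tuple:
    """Return (codepoint, general category, name) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.category(ch),
        unicodedata.name(ch, "<no name>"),  # PUA codepoints have no name
    )


def toy_reorder(text: str) -> str:
    """Hypothetical reordering pass: move a known pre-base vowel sign in
    front of the character that precedes it in logical order. Codepoints
    that are not in the table, including all PUA codepoints, are left
    exactly where they are."""
    out = []
    for ch in text:
        if ch in PREBASE_VOWEL_SIGNS and out:
            out.insert(len(out) - 1, ch)
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # Encoded Tamil: KA followed by VOWEL SIGN E, then two PUA codepoints.
    for ch in "\u0B95\u0BC6\uE000\uE001":
        print(describe(ch), "private use" if is_private_use(ch) else "encoded")
```

    In this sketch, toy_reorder("\u0B95\u0BC6") swaps the two characters
    because the vowel sign is in the table, while a string such as
    "\uE000\uE001" comes back unchanged: nothing in the character data says
    whether U+E001 is a vowel sign, a consonant, or something else entirely.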
