Re: 28th IUC paper - Tamil Unicode New

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Mon Aug 22 2005 - 14:35:32 CDT


    From: "Abhijit Dutta अभिजीत दत्ता" <dabhijit@in.ibm.com>
    > The Ministry of Information Technology, Govt. of India, is
    > distributing free CDs with a lot of Tamil software. The CD has
    > about 50 fonts which use the alternative scheme. It is proposed
    > to talk to the developers of the Tamil Open Office to use the font
    > with the alternate scheme. A representation will be sent to major
    > software vendors including Microsoft to use the new scheme. All
    > members of KTS have agreed to use the new scheme in their software."

    How can all software vendors agree to use the PUA scheme? If that were so,
    this PUA block would become permanently bound to the "New Tamil" encoding
    scheme, which would defeat the purpose of the PUA. Using PUA code points
    requires an agreement not only with the software vendors, but also with
    the actual users of the scheme.

    Nevertheless, I approve of such an initiative when it helps create a stable
    model for representing Modern Tamil. But it will not succeed unless the
    PUA scheme is also strictly bound to standard Unicode/ISO 10646-1 code
    points, through an unambiguous mapping that is bijective at least for
    the subset of Modern Tamil texts representable with this PUA scheme. With
    such a mapping, it will in fact be easier to interchange these texts by
    remapping the PUA-encoded texts to standard Unicode, so that PUA
    agreements will no longer be needed.
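
    As a minimal sketch of such a mapping: the PUA code points below are
    hypothetical and are not taken from the actual "New Tamil" proposal. Each
    PUA code point maps to exactly one sequence of standard Tamil code points,
    and the table is checked for invertibility before it is used.

```python
# Hypothetical PUA assignments, for illustration only; the real "New Tamil"
# scheme defines its own code points, which are not reproduced here.
PUA_TO_UNICODE = {
    "\uE000": "\u0B95",        # TAMIL LETTER KA
    "\uE001": "\u0B95\u0BBE",  # KA + TAMIL VOWEL SIGN AA
    "\uE002": "\u0B95\u0BBF",  # KA + TAMIL VOWEL SIGN I
    "\uE003": "\u0B95\u0BCD",  # KA + TAMIL SIGN VIRAMA (pulli)
}

# Bijectivity check: no two PUA code points may share a Unicode sequence.
UNICODE_TO_PUA = {seq: pua for pua, seq in PUA_TO_UNICODE.items()}
assert len(UNICODE_TO_PUA) == len(PUA_TO_UNICODE)


def pua_to_unicode(text):
    """Replace each mapped PUA character with its standard Unicode sequence."""
    return "".join(PUA_TO_UNICODE.get(ch, ch) for ch in text)


def unicode_to_pua(text):
    """Inverse mapping; longest match first, so KA+AA wins over KA alone."""
    keys = sorted(UNICODE_TO_PUA, key=len, reverse=True)
    out, i = [], 0
    while i < len(text):
        for seq in keys:
            if text.startswith(seq, i):
                out.append(UNICODE_TO_PUA[seq])
                i += len(seq)
                break
        else:
            # Not part of the mapped subset: pass through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


pua_text = "\uE001\uE003\uE000"
standard_text = pua_to_unicode(pua_text)
round_trip = unicode_to_pua(standard_text)
```

    For text that uses only mapped PUA code points, round_trip equals
    pua_text, so the standard Unicode form can be interchanged without any
    private agreement and converted back locally if needed.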

    Such a scheme will also help fix the various fonts so that they correctly
    support at least the subset of texts representable with the "New
    Tamil" PUA scheme. But this does not require that fonts be prepared to
    support these PUA code points. I think it will be much more productive to
    create OpenType fonts using the standard Unicode code points and a
    well-defined set of GSUB/GPOS tables. This way, these fonts will also be
    interoperable.

    Such an encoding scheme can then be viewed as a way to certify a minimum
    Level-1 compatibility requirement for Tamil fonts. Note that the encoding
    scheme could be fixed even more easily if the Tamil Nadu state government
    and Sri Lanka agreed to define an effective TSCII standard, which would
    reproduce the unambiguous mapping to the standard Unicode code points, as
    well as an unambiguous collation order, which is easy to implement with
    such a simplified scheme.

    For now, you can't say that TSCII is a standard: it has too many
    unstable variants, and it is neither approved by at least one national
    standards authority nor effectively implemented by major software vendors
    (a step required before becoming an internationally accepted standard like
    GB18030 or ISO 8859). I would rather see TSCII as an attempt to create a
    stable processing *model* than as an effective encoding.

    More generally, a charset registered by a national standards authority in
    the IANA charsets registry would work more successfully and more reliably
    than a system based on private agreements on PUAs: charsets can be easily
    transported in MIME, unlike PUA agreements; solutions for supporting
    charsets other than the UTFs already exist and are well implemented and
    deployed; and such charsets work reliably with Unicode/ISO 10646 as well.
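
    To illustrate the MIME point, here is a small sketch using Python's
    standard email and unicodedata modules. UTF-8 is used only as a stand-in
    for any registered charset, and U+E000 is an arbitrary PUA code point
    chosen for the example.

```python
import unicodedata
from email.message import EmailMessage

# A registered charset label travels inside the message itself.
msg = EmailMessage()
msg.set_content("தமிழ்", charset="utf-8")
charset_label = msg.get_content_charset()  # read back from Content-Type
body = msg.get_content()                   # decoded by the receiver via the label

# A PUA code point, by contrast, carries no meaning of its own: its general
# category is "Co" (private use) and it has no character name.
pua_char = "\uE000"
pua_category = unicodedata.category(pua_char)
pua_name = unicodedata.name(pua_char, None)
```

    The receiver needs nothing beyond the Content-Type header to decode the
    body, whereas interpreting pua_char requires an agreement that the
    message cannot carry.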



    This archive was generated by hypermail 2.1.5 : Mon Aug 22 2005 - 14:36:56 CDT