Re: 28th IUC paper - Tamil Unicode New

From: Richard Wordingham (richard.wordingham@ntlworld.com)
Date: Mon Aug 22 2005 - 18:34:49 CDT

    Kenneth Whistler wrote:

    > This is getting really off-track.
    >
    >> Surely the whole point of TUNE is that it work with basic Unicode
    >> support,
    > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    >
    > That translates to: "it can be displayed with a dumb rendering engine
    > and a simple font".

    Largely, yes. I suspect the default Unicode collation would also produce
    the correct results.

    >> without any awareness of Tamil as a distinct script. Having a special
    > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    >
    > In fact adding TUNE to Unicode "without any awareness of Tamil
    > as a distinct script" is a recipe for disaster.

    It junks data in the current encoding. How else is it a recipe for
    disaster?

    > Searches and matches will fail [on the standard Tamil encoding]. The only
    > way it comes close to being viable would be to treat it like the Hangul
    > multiple representations in the standard: you have to make the software
    > *aware* of the Tamil script to establish the equivalences between the
    > existing Tamil encoding and the TUNE encoding.

    Are such canonical equivalences now permitted? I suppose they could be made
    equivalent in the default Unicode collation algorithm. Another nasty issue
    is that, if they were canonically equivalent, conversion from TUNE
    characters to NFD (thus current Tamil) would make the text dependent on
    sophisticated rendering and defeat a large part of the point of TUNE.
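
    To make the NFD concern concrete, here is a minimal sketch using Python's
    standard unicodedata module. TUNE has no assigned code points, so the
    sketch uses two stand-ins that are only assumptions for illustration:
    Hangul (the precedent mentioned above) and the existing TAMIL VOWEL SIGN O,
    which already has a canonical decomposition. The helper name
    canonically_equal is made up for this example.

```python
import unicodedata

def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFD."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# Hangul: one precomposed syllable vs. the equivalent conjoining jamo.
syllable = "\uD55C"          # HANGUL SYLLABLE HAN
jamo = "\u1112\u1161\u11AB"  # CHOSEONG HIEUH + JUNGSEONG A + JONGSEONG NIEUN

# A plain code point comparison sees two different strings ...
print(syllable == jamo)                   # False
# ... while a normalization-aware comparison treats them as the same text.
print(canonically_equal(syllable, jamo))  # True

# Existing Tamil: KA + VOWEL SIGN O. NFD replaces the single vowel sign
# with its two-part canonical decomposition (VOWEL SIGN E + VOWEL SIGN AA).
tamil_ko = "\u0B95\u0BCA"
nfd = unicodedata.normalize("NFD", tamil_ko)
print([f"U+{ord(c):04X}" for c in nfd])   # ['U+0B95', 'U+0BC6', 'U+0BBE']
```

    If TUNE characters were given canonical decompositions to the current
    Tamil characters, NFD would rewrite them in the same way, and the
    normalized text would then need the script-aware rendering that TUNE was
    meant to avoid.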

    > Encoding TUNE, whether in the PUA or elsewhere, *without any
    > awareness of Tamil as a distinct script*, defeats the purpose
    > of an encoding in the first place.

    Please enlighten me. What's fundamentally wrong with having LATIN LETTER
    TAMIL K, LATIN LETTER TAMIL KA, etc.? I thought scripts were chiefly
    relevant in Unicode because characters in the same script tend to have
    similar properties and have to work together.

    Richard.


