Encoding Tamil SRI

From: Peter Jacobi (peter_jacobi@gmx.net)
Date: Sat Nov 01 2003 - 06:30:12 EST


    Dear List Members,

    I'm looking for enlightenment on how best (or least badly) to encode
    Tamil SRI in Unicode. The glyph can be seen at code point 0x82 of
    TSCII 1.7 at
    http://www.tamil.net/tscii/charset17.gif

    The transcoding tables I found, especially the GNU libc iconv
    implementation at:

    http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/libc/iconvdata/TSCII.precomposed?rev=1.1&content-type=text/plain&cvsroot=glibc

    list the following mapping:

    0x82 : 0x0BB8 0x0BCD 0x0BB0 0x0BC0
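
    To make the mapping easier to inspect, here is a minimal sketch that
    spells out the four code points of the glibc entry with their Unicode
    character names. The names TSCII_0X82, to_text and describe are made up
    for illustration and are not part of glibc or any TSCII tool.

```python
import unicodedata

# Mapping for TSCII byte 0x82 as quoted from the glibc table above.
TSCII_0X82 = (0x0BB8, 0x0BCD, 0x0BB0, 0x0BC0)


def to_text(code_points):
    """Join a sequence of code point integers into a Python string."""
    return "".join(chr(cp) for cp in code_points)


def describe(code_points):
    """Return one 'U+XXXX NAME' line per code point.

    Code points unknown to the local Unicode database are shown
    as '<unnamed>' instead of raising an error.
    """
    return [
        f"U+{cp:04X} {unicodedata.name(chr(cp), '<unnamed>')}"
        for cp in code_points
    ]


sri = to_text(TSCII_0X82)
for line in describe(TSCII_0X82):
    print(line)
```

    The sequence is a consonant, a virama, a second consonant and a vowel
    sign, which is why it cannot be told apart from ordinary text that
    happens to use the same letters.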

    So far, the feedback I have received from Tamil experts seems to
    indicate that no satisfactory encoding exists and that they would
    prefer a distinct codepoint, a request that was rejected.
    For example, 0x0BB8 0x0BCD 0x0BB0 0x0BC0 is the word 'laughable' in Tamil.

    The alternatives suggested were:
    (0BB8)(0BCD)(0BB1)(0BC0)
    (0BB6)(0BCD)(0BB1)(0BC0) (if and when U+0BB6 becomes part of Unicode)
    (0B9A)(0BBF)(0BB1)(0BC0)

    I'm far too clueless to restart the distinct codepoint discussion;
    I am rather looking for a pragmatic solution for transcoding.
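
    One pragmatic shape for such a transcoder, sketched under assumptions:
    keep every candidate sequence from this message in a table, pick one
    when decoding TSCII byte 0x82, and recognise any of them when checking
    a string. The variant labels and the function names decode_sri and
    match_variant are hypothetical; nothing here reflects an agreed
    encoding.

```python
SRI_BYTE = 0x82  # TSCII 1.7 position of the SRI glyph

# Candidate Unicode sequences listed in this message.
# The keys are made-up labels, not standard names.
SRI_CANDIDATES = {
    "glibc": "\u0BB8\u0BCD\u0BB0\u0BC0",
    "sa_rra": "\u0BB8\u0BCD\u0BB1\u0BC0",
    "sha_rra": "\u0BB6\u0BCD\u0BB1\u0BC0",
    "ca_i_rra": "\u0B9A\u0BBF\u0BB1\u0BC0",
}


def decode_sri(byte, variant="glibc"):
    """Map TSCII byte 0x82 to the chosen candidate sequence."""
    if byte != SRI_BYTE:
        raise ValueError(f"not the SRI byte: {byte:#04x}")
    return SRI_CANDIDATES[variant]


def match_variant(text):
    """Return the label of the candidate equal to text, or None.

    Only whole-string equality is tested. Searching for these
    sequences inside running text would also hit ordinary words
    spelled with the same code points.
    """
    for label, sequence in SRI_CANDIDATES.items():
        if text == sequence:
            return label
    return None


print(match_variant(decode_sri(SRI_BYTE, "sa_rra")))
```

    Keeping the choice in one table means the mapping can be switched
    later without touching the rest of the transcoder.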

    Regards,
    Peter Jacobi
    Hamburg, Germany


