Re: Transcoding Tamil in the presence of markup

From: Peter Jacobi (peter_jacobi@gmx.net)
Date: Sun Dec 07 2003 - 07:55:54 EST

    Hi Christopher, All,

    > In Unicode U+0BBE, U+0BC6 and U+0BCA are all dependent vowel signs

    Yes, but this fact alone doesn't meet users' expectations. It is inherited
    from the ISCII unification.
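
    For reference, the properties of the three code points quoted above can be
    checked against the Unicode Character Database. This is only a small
    sketch using Python's standard unicodedata module; the helper name
    describe() is made up for illustration and is not part of any library.

```python
import unicodedata

# The three code points named in the quoted message.
VOWEL_SIGNS = ["\u0BBE", "\u0BC6", "\u0BCA"]


def describe(ch: str) -> tuple[str, str]:
    """Return (Unicode name, general category) for one character."""
    return unicodedata.name(ch), unicodedata.category(ch)


for ch in VOWEL_SIGNS:
    name, category = describe(ch)
    print(f"U+{ord(ch):04X} {name} {category}")

# U+0BCA has a canonical decomposition into U+0BC6 followed by U+0BBE,
# so NFD normalization splits it into two code points.
decomposed = unicodedata.normalize("NFD", "\u0BCA")
print([f"U+{ord(c):04X}" for c in decomposed])
```

    A category starting with "M" marks a combining character, which is the
    property the quoted statement refers to.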

    > Since in some fonts a base character + combining vowel mark
    > might be displayed by a single ligature glyph, it makes sense to apply the
    > formatting of a base character to any dependent combining characters as
    > well.

    In Tamil, most vowels never form ligatures. (O.K., the exact value of 'most'
    has changed over time and was lower in the past, but it was never less
    than 7/11, to the best of my knowledge, and is now 9/11.)

    But the core problem is on the practical side, not the theoretical one.

    As it was possible to style individual characters in legacy encodings
    (heck, it was possible using a mechanical Tamil typewriter!), what is to
    be done in migration to Unicode?

    So, I'm still wondering whether Unicode and HTML4 will consider
      <span style='color:#00f'>&#x0BB2;</span>&#x0BBE;
    valid, leaving it to the user agent to make the best of it.
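
    To find such cases in existing documents during migration, one could test
    whether the text right after a styled span begins with a combining mark.
    The following is a rough sketch only: splits_cluster() is a hypothetical
    helper, and checking the general category is a simplification of real
    grapheme cluster segmentation, not a full implementation of it.

```python
import html
import unicodedata


def splits_cluster(styled_text: str, following_text: str) -> bool:
    """Return True if the text after a styled span starts with a combining mark.

    Both arguments are the raw HTML text content (character references are
    allowed). Assumption: a leading character whose general category starts
    with "M" belongs to the base character that ended the styled span.
    """
    styled = html.unescape(styled_text)
    following = html.unescape(following_text)
    if not styled or not following:
        return False
    return unicodedata.category(following[0]).startswith("M")


# The fragment from this message: the span holds U+0BB2, and the
# vowel sign U+0BBE sits outside it.
print(splits_cluster("&#x0BB2;", "&#x0BBE;"))
```

    What a user agent should do once such a split is detected is exactly the
    open question; the sketch only locates the split.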

    > In Mozilla you may be completely breaking the font lookups by separately
    > formatting the different parts of a conjunct.

    As I understand it, Mozilla (i.e. Jungshik Shin) internally transcodes to
    TSCII before display. Or is this only done on Linux?

    Regards,
    Peter Jacobi



    This archive was generated by hypermail 2.1.5 : Sun Dec 07 2003 - 08:43:01 EST