Re: Transcoding Tamil in the presence of markup

From: Jungshik Shin (jshin@mailaps.org)
Date: Sun Dec 07 2003 - 09:34:36 EST

    On Sun, 7 Dec 2003, Peter Jacobi wrote:

    Hi,

    > > [..] Anyway, he
    > > should have used 'lang' tag to help browsers pick up fonts. In two
    > > pages above, simply adding 'lang="ta"' to <table ....> would suffice.
    >
    > But I assume (and tested), that language tagging doesn't help
    > Mozilla in rendering the 'styled' example.

      Sure, it doesn't (for the reason I explained in another message.)
    My point was not that it'd help Mozilla render the style example but
    that language tagging in general is a good idea.

    > And, unfortunately, language tagging the TSCII version interferes with
    > the font hack to correctly display TSCII pages. If you want to be able to see
    > TSCII and Unicode Tamil pages, Mozilla's font setup must associate a
    > Tamil Unicode font with 'Tamil' and a Tamil TSCII font with 'User Defined'.

      This is not an unreasonable requirement, is it? How could a
    browser know what's meant by 'user defined'?

    > There is some mixup of lang and encoding tagging, which I didn't fully
    > understand.

       When lang is not explicitly specified, Mozilla resorts to 'inferring'
    'langGroup' ('script (group)' would have been a better term) from
    the page encoding. Because UTF-8 is script-neutral, it's important to
    specify 'lang' explicitly. Your page is in ISO-8859-1, so without
    lang specified it's assumed to be in the 'x-western' langGroup (well, Latin
    script). Anyway, this behavior changed slightly in the Windows
    version recently (I forget whether I committed that patch before or after 1.4):
    each Unicode block is now assigned a default 'script'. The way fonts
    are picked up by the Xft version of Mozilla makes it harder to do the
    equivalent on Linux.
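
    The fallback described above can be sketched roughly as follows. This is
    only an illustration, not Mozilla's code: the function name and both lookup
    tables are hypothetical, and the only mapping taken from this message is
    ISO-8859-1 to 'x-western'. The other entries are assumptions.

```python
# Hypothetical lookup tables for illustration; the real Mozilla tables
# are larger and may differ in detail.
CHARSET_TO_LANG_GROUP = {
    "iso-8859-1": "x-western",       # from the message: Latin script
    "x-user-defined": "x-user-def",  # assumed name for the 'User Defined' group
    # "utf-8" is deliberately absent: the encoding is script-neutral.
}

LANG_TO_LANG_GROUP = {
    "ta": "ta",         # assumed group name for Tamil
    "en": "x-western",  # assumed
}


def infer_lang_group(lang, charset):
    """Return a langGroup for font selection, or None if nothing can be inferred.

    An explicit 'lang' wins; otherwise fall back to the page encoding.
    """
    if lang:
        primary = lang.lower().split("-")[0]  # 'ta-IN' -> 'ta'
        group = LANG_TO_LANG_GROUP.get(primary)
        if group is not None:
            return group
    return CHARSET_TO_LANG_GROUP.get(charset.lower())


if __name__ == "__main__":
    print(infer_lang_group(None, "ISO-8859-1"))  # encoding decides
    print(infer_lang_group("ta", "ISO-8859-1"))  # explicit lang decides
    print(infer_lang_group(None, "UTF-8"))       # nothing to infer from
```

    With UTF-8 and no 'lang', the sketch has nothing to go on, which is why
    tagging the language explicitly matters for such pages.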

    > > You're right. Anyway, this is an interesting challenge to
    > > layout/rendering engines.
    >
    > Then you consider
    > <span style='color:#00f'>&#x0BB2;</span>&#x0BCA;
    > to be valid input, which ideally should render as intended?

      Yes, I do. As I wrote in another message, this thread should
    be interesting to people on the W3C I18N WG. You may consider moving (or
    crossposting) the thread to the I18N WG's public mailing list.
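
    To see why this input is hard on a layout engine, here is a small sketch
    that looks at the two characters in the example using Python's standard
    'unicodedata' module. The tag stripping is a crude regular expression that
    is only adequate for this one-line example. The point is that the second
    character is a combining vowel sign that sits outside the styled span, and
    that it canonically decomposes into two parts.

```python
import html
import re
import unicodedata

markup = "<span style='color:#00f'>&#x0BB2;</span>&#x0BCA;"

# Crude tag removal, then decode the numeric character references.
text = html.unescape(re.sub(r"<[^>]+>", "", markup))

for ch in text:
    print(f"U+{ord(ch):04X}  {unicodedata.category(ch)}  {unicodedata.name(ch)}")

# The vowel sign is a spacing combining mark (category Mc) that attaches
# to the preceding consonant, although the markup styles only the consonant.
vowel_sign = text[1]
parts = unicodedata.decomposition(vowel_sign).split()
print("decomposes to:", [unicodedata.name(chr(int(p, 16))) for p in parts])
```

    The consonant and the vowel sign form one unit for shaping purposes, but
    the span boundary falls between them, so the engine has to shape across
    the boundary while still applying the color to only part of the result.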

      Jungshik


