Re: Transcoding Tamil in the presence of markup

From: John Delacour (JD@BD8.COM)
Date: Sat Dec 06 2003 - 20:37:10 EST


    It's not clear to me what you are saying. I have viewed markup-uc.htm
    with Safari and OmniWeb on the Mac and with IE 6.0.2800.11061S and
    Opera 7 on a machine running Windows NT4 and there is no major
    problem with the styled display which is precisely as you specify in
    the source. Two things might just conceivably help other browsers
    along -- to use decimal rather than hexadecimal entities, and to
    declare utf-8 as the character set -- but this won't make a scrap of
    difference to a browser such as IE 5.2.3 for Mac (and many others)
    because it simply can't deal with anything that won't convert to
    standard legacy character sets.
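
A minimal sketch of the first suggestion, rewriting hexadecimal character references as decimal ones. The function name `hex_to_decimal_entities` and the regular expression are illustrative assumptions, not taken from the test page or from any particular tool.

```python
import re

# Matches hexadecimal numeric character references such as &#x0BB2;
# (hypothetical helper pattern, written for this illustration only).
HEX_NCR = re.compile(r"&#[xX]([0-9A-Fa-f]+);")


def hex_to_decimal_entities(html: str) -> str:
    """Rewrite &#xHHHH; references as &#DDDD;, leaving all other text untouched."""
    return HEX_NCR.sub(lambda m: f"&#{int(m.group(1), 16)};", html)


sample = "<span style='color:#00f'>&#x0BB2;</span>&#x0BBE;"
converted = hex_to_decimal_entities(sample)
print(converted)  # <span style='color:#00f'>&#2994;</span>&#3006;
```

The second suggestion, declaring the character set, would normally be a line such as `<meta http-equiv="Content-Type" content="text/html; charset=utf-8">` in the page head, or the equivalent HTTP header. As noted above, neither change helps a browser that cannot handle the characters at all.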

    I have come across several instances of Win IE 6 not displaying
    Unicode characters as it should, or not at all, and there are probably
    many such instances outside my experience -- whether due to the OS or
    to MSIE -- but a good up-to-date browser will not misbehave.

    JD

    At 7:39 pm +0100 6/12/03, Peter Jacobi wrote:

    >In Unicode:
    > lA <span style='color:#00f'>&#x0BB2;</span>&#x0BBE;
    > le <span style='color:#00f'>&#x0BB2;</span>&#x0BC6;
    > lo <span style='color:#00f'>&#x0BB2;</span>&#x0BCA;
    >
    >It is easy to see, that simple n:m mapping cannot make this conversion.
    >It is not that easy to judge whether this is the desired conversion at all.
    >And what the receiving software should do with it.
    >
    >Some tests: In Mozilla 1.4.1 the characters fall apart and in IE5.5 the
    >style expands to the entire orthographic syllable.
    >Unicode test page: http://www.jodelpeter.de/i18n/tamil/markup-uc.htm
    >TSCII test page: http://www.jodelpeter.de/i18n/tamil/markup-tscii.htm
    >
    >After seeing this effect at its source, it's now clear why you can't style
    >individual Tamil characters in a word processor, when using Unicode (whereas
    >you can do so, in legacy encodings).
    >
    >It's hard to promote Unicode, when things that have worked in the past,
    >stop working.
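
The quoted point about n:m mapping can be made concrete by listing the code points behind the three syllables. The sketch below uses only Python's standard `unicodedata` module; the variable names are illustrative. It shows that Unicode stores the consonant first in every case, and that the single vowel sign O is canonically equivalent to the sequence E + AA. The E part is drawn to the left of the consonant, so an encoding that stores glyphs in display order (assumed here to be the case for the legacy encoding under discussion) has to place material on both sides of the styled consonant.

```python
import unicodedata

LA = "\u0BB2"  # TAMIL LETTER LA, the consonant wrapped in the <span>

# The three syllables from the quoted message, in Unicode logical order.
syllables = {
    "lA": LA + "\u0BBE",
    "le": LA + "\u0BC6",
    "lo": LA + "\u0BCA",
}


def describe(text: str) -> list[str]:
    """Return 'U+XXXX NAME' for every code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


for name, text in syllables.items():
    print(name, describe(text))

# Canonical decomposition splits the two-part vowel sign O into E + AA.
lo_decomposed = unicodedata.normalize("NFD", syllables["lo"])
print(describe(lo_decomposed))
```

In logical order the consonant is always a single leading code point, which is why the markup can wrap it. Whether a renderer should colour only that part of the resulting syllable is the open question raised in the quoted message.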


