Re: Languages supported by UTF8 and UTF16

From: António Martins-Tuválkin (antonio@tuvalkin.web.pt)
Date: Sun Sep 11 2005 - 07:55:24 CDT


    On 2005.09.10, 23:40, Jukka K. Korpela <jkorpela@cs.tut.fi> wrote:

    > Unicode contains _most_ accented letters used in human languages
    > as precomposed characters, but not all. There's a clear distinction
    > here.

    Considering what canonical decomposition means, and that e.g. U+006F
    U+0301 is absolutely identical to U+00F3, that distinction, however clear,
    is meaningless. And of course we know why precomposed characters were
    added in the first place — it is about legacy encoding of previous
    standards with different views on combining characters, not a desire to
    make a "distinction".
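For anyone who wants to see the equivalence rather than take it on faith, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `canonically_equal` is made up for illustration and is not part of any library. The two sequences differ as raw code points, yet compare equal once both are brought to the same normalization form, which is what canonical equivalence amounts to in practice.

```python
import unicodedata

decomposed = "\u006F\u0301"   # 'o' followed by COMBINING ACUTE ACCENT
precomposed = "\u00F3"        # LATIN SMALL LETTER O WITH ACUTE, one code point


def canonically_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFD."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# A plain code point comparison sees two different sequences.
raw_equal = decomposed == precomposed

# Composing the decomposed form (NFC) yields the precomposed character.
nfc_equal = unicodedata.normalize("NFC", decomposed) == precomposed

# Decomposing the precomposed character (NFD) yields the combining sequence.
nfd_equal = unicodedata.normalize("NFD", precomposed) == decomposed

print(raw_equal, nfc_equal, nfd_equal, canonically_equal(decomposed, precomposed))
```

A process that compares text without normalizing first will still treat the two as different, so the equivalence holds at the level of the standard and has to be applied explicitly in code.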

    > my text was supposed to address people's intuitive expectations

    But Jukka, for people with nothing more than intuitive expectations about
    computer text processing the backstage works of what's a character and
    what's not are completely transparent — they should not worry their heads
    with such arcana. ;-)

    -- ____.
    António MARTINS-Tuválkin | ()|
    <antonio@tuvalkin.web.pt> |####|
    Estrada de Benfica, 692-c/v d.ta      Não me invejo de quem tem |
    PT-1500-111 LISBOA                    carros, parelhas e montes |
    +351 934 821 700, +351 217 150 939    só me invejo de quem bebe |
    http://www.tuvalkin.web.pt/bandeira/  a água em todas as fontes |


