From: António Martins-Tuválkin (antonio@tuvalkin.web.pt)
Date: Sun Sep 11 2005 - 07:55:24 CDT
On 2005.09.10, 23:40, Jukka K. Korpela <jkorpela@cs.tut.fi> wrote:
> Unicode contains _most_ accented letters used in human languages
> as precomposed characters, but not all. There's a clear distinction
> here.
Considering what canonical decomposition means, and that e.g. U+006F
U+0301 is canonically equivalent to U+00F3, that distinction, however clear,
is meaningless. And of course we know why precomposed characters were
added in the first place — it is about legacy encoding of previous
standards with different views on combining characters, not a desire to
make a "distinction".
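A minimal sketch of that equivalence, assuming Python's standard unicodedata module (the variable names are illustrative only). It compares the two spellings of "ó" before and after normalization:

```python
import unicodedata

decomposed = "\u006F\u0301"   # LATIN SMALL LETTER O + COMBINING ACUTE ACCENT
precomposed = "\u00F3"        # LATIN SMALL LETTER O WITH ACUTE

# As raw code point sequences the two strings differ...
raw_equal = decomposed == precomposed

# ...but canonical composition (NFC) or decomposition (NFD)
# maps each spelling onto the other.
nfc_equal = unicodedata.normalize("NFC", decomposed) == precomposed
nfd_equal = unicodedata.normalize("NFD", precomposed) == decomposed

print(raw_equal, nfc_equal, nfd_equal)  # False True True
```

The raw comparison fails, so software that wants to treat the two spellings alike has to normalize both sides before comparing.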
> my text was supposed to address people's intuitive expectations
But Jukka, for people with nothing more than intuitive expectations about
computer text processing the backstage workings of what's a character and
what's not are completely transparent — they should not worry their heads
with such arcana. ;-)
-- ____.
António MARTINS-Tuválkin | ()|
<antonio@tuvalkin.web.pt> |####|
Estrada de Benfica, 692-c/v d.ta Não me invejo de quem tem |
PT-1500-111 LISBOA carros, parelhas e montes |
+351 934 821 700, +351 217 150 939 só me invejo de quem bebe |
http://www.tuvalkin.web.pt/bandeira/ a água em todas as fontes |