Re: Cultural bias

From: Michael Everson (everson@indigo.ie)
Date: Sun Jan 12 1997 - 07:45:54 EST


At 11:00 -0800 1997-01-11, Maurice Bauhahn wrote:
>It is not discrimination against non-Roman scripts that Unicode is trying
>hard to eliminate ligatures and variant glyph forms...it is to their
>advantage! Think of how complicated search engines and sorting routines
>must be to sort out that redundant confusion! The reason Latin scripts
>have those presentation forms is because of legacy 'standards' concerns
>that we should be happy many non-Roman scripts can be freed from. Don't
>complain...rejoice!

There is another side to this, Maurice. A lot of languages use a lot of
accented characters. The Irish word "éirígí" has six letters in it. In
Latin 1 it has six characters in it. In decomposed Unicode encoding it has
nine characters (e´iri´gi´), because each of the three accented vowels
becomes a base letter followed by a combining acute. So one thing that
canonical decomposition does is increase the size of our files.
Considerably. In Europe, we have considered this objectionable.
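
The count above can be checked with a minimal Python sketch. It assumes
the standard-library unicodedata module and today's names for the
normalization forms (NFC for precomposed, NFD for canonically decomposed),
which are not terms used in this message. The byte counts depend on the
encoding form chosen, and the ones shown are only illustrative.

```python
import unicodedata

# The word from the message, written with escapes so the source is unambiguous.
word = "\u00e9ir\u00edg\u00ed"  # éirígí

composed = unicodedata.normalize("NFC", word)    # precomposed letters
decomposed = unicodedata.normalize("NFD", word)  # base letter + U+0301

print(len(composed))    # 6
print(len(decomposed))  # 9

# Byte counts depend on the encoding form; these are illustrative choices.
print(len(composed.encode("latin-1")))      # 6
print(len(composed.encode("utf-16-be")))    # 12
print(len(decomposed.encode("utf-16-be")))  # 18
```

In this sketch the decomposed form is half again as long as the
precomposed one for this particular word; the ratio for running text
depends on how many accented letters the language uses.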

Naturally it doesn't much affect file size in English. But it does in
Irish. And Czech, Polish, Icelandic....

--
Michael Everson, Everson Gunn Teoranta
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire (Ireland)
Gutháin:  +353 1 478-2597, +353 1 283-9396
http://www.indigo.ie/egt
27 Páirc an Fhéithlinn; Baile an Bhóthair; Co. Átha Cliath; Éire


