Sun Jan 12 1997 - 07:45:54 EST

At 11:00 -0800 1997-01-11, Maurice Bauhahn wrote:
>It is not discrimination against non-Roman scripts that Unicode is trying
>hard to eliminate ligatures and variant glyphs; it is to their
>advantage! Think of how complicated search engines and sorting routines
>must be to sort out that redundant confusion! The reason Latin scripts
>have those presentation forms is because of legacy 'standards' concerns
>that we should be happy many non-Roman scripts can be freed from. Don't

There is another side to this, Maurice. A lot of languages use a lot of
accented characters. The Irish word "éirígí" has six letters in it. In
Latin 1 it has six characters in it. In decomposed Unicode encoding it has
nine characters (e´iri´gi´), because each acute accent is stored as a
separate combining character. So one thing that canonical decomposition does
is increase the size of our files. Considerably. In Europe, we have
considered this objectionable.
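
The count above can be checked with a rough sketch like the one below. It assumes Python and its standard unicodedata module, neither of which is part of this message; the NFC and NFD normalization forms stand in here for the precomposed and canonically decomposed encodings, and the UTF-16 byte counts are only one possible way of measuring file size.

```python
import unicodedata

# "éirígí" written with precomposed letters (U+00E9, U+00ED), as in Latin 1.
word = "\u00e9ir\u00edg\u00ed"

# NFC keeps the precomposed letters; NFD applies canonical decomposition,
# splitting each accented letter into base letter + U+0301 COMBINING ACUTE.
composed = unicodedata.normalize("NFC", word)
decomposed = unicodedata.normalize("NFD", word)

composed_len = len(composed)        # number of characters, precomposed
decomposed_len = len(decomposed)    # number of characters, decomposed

# Byte sizes under two encodings (assumed here purely for illustration).
latin1_bytes = len(composed.encode("latin-1"))
utf16_composed_bytes = len(composed.encode("utf-16-be"))
utf16_decomposed_bytes = len(decomposed.encode("utf-16-be"))

print(composed_len, decomposed_len)
print(latin1_bytes, utf16_composed_bytes, utf16_decomposed_bytes)
```

For this one word the decomposed form is half again as long as the precomposed one; how much a whole file grows depends on how often accented letters occur in the text.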

Naturally it doesn't much affect file size in English. But it does in
Irish. And Czech, Polish, Icelandic....

