Re: Normalization Form KC for Linux

From: Michael Everson
Date: Sun Aug 29 1999 - 07:36:28 EDT

At 11:03 +0200 1999-08-29, Dan Oscarsson wrote:

>> >For example:
>> >- having nonspacing combining characters after instead of
>> >before the base character.
>> I understood that it's much better to have them after.
>I have not yet located why. I can see ways where software can
>handle them much more easily if they come before.

I think it's a question of logic. One writes an o and then puts a ~ on top
of it. On typewriters, they switched this around because of the mechanical
movement of the carriage and nonspacing technology. But for instance a
smart font would take the bounding box

>Well the Swedish alphabet includes ä but not à. The last is an
>"a with an accent above", while ä is a letter in itself.
>Just because the glyph looks like an "a with two dots above", it is not.
>It is wrong to decompose a character just because its glyph looks
>as if it were composed of an accent and another character.

So what? Your answer shows exactly what I said: that ä and à differ only in
the minds of Swedes.

1. Both are used in Swedish texts (2 biljetter à 40 kronor).

2. Both can be represented in the UCS in two ways: ä by 00E4 or 0061 +
0308, and à by 00E0 or 0061 + 0300.

3. à is sorted as a variant of a with a diacritical mark (p. 1 of Norstedts
svensk-engelska ordbok 1994).

4. ä is sorted as a separate letter of the alphabet (p. 855 of Norstedts
svensk-engelska ordbok 1994).
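
As a rough illustration of point 2, here is a minimal sketch in Python. It
assumes the standard unicodedata module; the helper name canonically_equal
is made up for this example. It checks that the precomposed and decomposed
spellings (base letter a, U+0061, followed by a combining mark) differ as
code point sequences but compare equal after canonical normalization.

```python
import unicodedata

# Precomposed and decomposed spellings of the two letters.
A_DIAERESIS_COMPOSED = "\u00e4"       # ä as a single code point
A_DIAERESIS_DECOMPOSED = "a\u0308"    # a + COMBINING DIAERESIS
A_GRAVE_COMPOSED = "\u00e0"           # à as a single code point
A_GRAVE_DECOMPOSED = "a\u0300"        # a + COMBINING GRAVE ACCENT


def canonically_equal(s, t):
    """Compare two strings after canonical decomposition (NFD).

    Hypothetical helper for illustration only.
    """
    return unicodedata.normalize("NFD", s) == unicodedata.normalize("NFD", t)


# The raw code point sequences differ...
print(A_DIAERESIS_COMPOSED == A_DIAERESIS_DECOMPOSED)
# ...but both letters are canonically equivalent to their decompositions.
print(canonically_equal(A_DIAERESIS_COMPOSED, A_DIAERESIS_DECOMPOSED))
print(canonically_equal(A_GRAVE_COMPOSED, A_GRAVE_DECOMPOSED))
```

Note that the combining mark follows the base letter in the decomposed form,
and that the character encoding treats ä and à in exactly the same way.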

Technically, there is no difference except in Swedish implementation.
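
The language-specific part can live in the sorting rules rather than in the
encoding. The following is only a simplified sketch under stated
assumptions: the alphabet string, RANK table and swedish_key function are
hypothetical names, and a real collation would also need secondary weights
(for example to order à after a when words are otherwise equal). It gives
å, ä and ö their own positions after z, and folds other accented letters
to their base letter.

```python
import unicodedata

# Assumed, simplified Swedish ordering: a-z, then å, ä, ö as separate letters.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
RANK = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}


def swedish_key(word):
    """Sort key for the simplified ordering above (illustrative only)."""
    key = []
    # Compose first, so decomposed and precomposed input behave alike.
    for ch in unicodedata.normalize("NFC", word.lower()):
        if ch not in RANK:
            # Not a Swedish letter: fold to the base letter, so à sorts with a.
            base = unicodedata.normalize("NFD", ch)[0]
            if base in RANK:
                ch = base
        # Anything still unknown sorts after the whole alphabet.
        key.append(RANK.get(ch, len(RANK)))
    return key


words = ["ära", "zon", "à", "apa"]
print(sorted(words, key=swedish_key))
```

With this key, à lands among the a entries while ära lands after zon,
whichever of the two encodings was used for the input.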

Michael Everson * Everson Gunn Teoranta *
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
Guthán: +353 1 478 2597 ** Facsa: +353 1 478 2597 (by arrangement)
27 Páirc an Fhéithlinn;  Baile an Bhóthair;  Co. Átha Cliath; Éire
