Re: Normalization Form KC for Linux

From: Michael Everson
Date: Sat Aug 28 1999 - 06:32:28 EDT

At 01:54 -0700 1999-08-28, Dan wrote:

>And if you work with linguistics, an ä cannot be decomposed when you
>work with Swedish, as it is a single letter. The dots above are not an
>accent or diacritic mark. So here is a case where you need to
>be able to represent what looks like the same glyph "an a with
>two dots above", both as one character and as an a with combining dots.

Uh, you mean that it can't be displayed as a¨, right? Of course the Swedish
letter can be decomposed in the text stream to a + combining diaeresis. If
you want to display a¨, you can follow the a with either a spacing diaeresis
or a space + combining diaeresis.
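
A minimal sketch of the decomposition described above, assuming Python's
standard unicodedata module as the normalizer. The variable names and the
code_points helper are illustrative only. It shows ä and à decomposing the
same way in the text stream, and the two ways of getting a detached a¨.

```python
import unicodedata

def code_points(text):
    """Return the code points of text in U+XXXX notation (illustrative helper)."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

# Precomposed ä (U+00E4) and à (U+00E0) are each a single code point.
a_diaeresis = "\u00e4"
a_grave = "\u00e0"

# Canonical decomposition (NFD) turns each into base letter + combining mark.
decomposed = unicodedata.normalize("NFD", a_diaeresis)   # a + U+0308
decomposed_grave = unicodedata.normalize("NFD", a_grave)  # a + U+0300

# Canonical composition (NFC) restores the precomposed form, so the two
# representations are equivalent text.
recomposed = unicodedata.normalize("NFC", decomposed)

# To show the dots detached from the letter, follow the a with a spacing
# diaeresis (U+00A8) or with a space + combining diaeresis.
spacing_form = "a\u00a8"
space_combining_form = "a \u0308"

# U+00A8 has a compatibility decomposition to space + U+0308, so
# Normalization Form KC folds the first spelling into the second.
nfkc_spacing = unicodedata.normalize("NFKC", spacing_form)

print(code_points(decomposed))
print(code_points(nfkc_spacing))
```

Neither detached spelling is touched by NFC; only the compatibility forms
(NFKC, NFKD) replace the spacing diaeresis.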

>I could say the same, but for different reasons.
>For example:
>- having non spacing combining characters after instead of
>before base character.

I understood that it's much better to have them after.

>- not accepting that some glyphs that look like combined characters
>are not combined characters but characters in themselves, and should
>have code values of their own.

Apart from dotless j I can't think of any examples of this. Swedish ä does
not differ in any way from French (or indeed Swedish) à, except in the
minds of Swedes. By the same token, Welsh ll does not differ in any way
from English ll, except the Welsh sort it differently. It's a string of
characters considered as one.
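
A rough sketch of "a string of characters considered as one", assuming a
simplified Welsh alphabet table and greedy digraph matching. The names
WELSH_ALPHABET, RANK and welsh_key are hypothetical, and the greedy match is
a simplification: it will misread words where, say, n and g are two separate
letters. The encoded text is the same l + l as in English; only the sort key
differs.

```python
# Welsh alphabet order, with digraphs listed as single letters.
WELSH_ALPHABET = [
    "a", "b", "c", "ch", "d", "dd", "e", "f", "ff", "g", "ng", "h", "i",
    "j", "l", "ll", "m", "n", "o", "p", "ph", "r", "rh", "s", "t", "th",
    "u", "w", "y",
]
RANK = {letter: index for index, letter in enumerate(WELSH_ALPHABET)}

def welsh_key(word):
    """Build a sort key that treats each digraph as one letter.

    Greedy matching is an assumption made for brevity; characters outside
    the table sort after every listed letter.
    """
    word = word.lower()
    key = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in RANK:
            # Two code points, one letter for sorting purposes.
            key.append(RANK[pair])
            i += 2
        else:
            key.append(RANK.get(word[i], len(WELSH_ALPHABET)))
            i += 1
    return key

words = ["llan", "lwc", "mam", "lol"]
code_point_order = sorted(words)
welsh_order = sorted(words, key=welsh_key)

print(code_point_order)
print(welsh_order)
```

In code point order "llan" comes first; with the digraph-aware key it moves
after every word beginning with a plain l.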

The reason _I_ like non-decomposed characters is that I make fonts, and
adequate smart-font support for font designers is still only very, very
incipient. There's a lot of promise in ATSUI, for instance, but I have yet
to be able to make use of it in any practical way.

