RE: Whats the difference between a composite and a combining sequence?

From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Tue Jul 09 2002 - 05:59:24 EDT


Theodore H. Smith wrote:
> http://www.unicode.org/unicode/reports/tr15/ mentions both
> composites and combining sequences.
>
> But it doesn't tell us the difference. I know what a combining
> sequence is. If I didn't know what a composite was, I'd guess it
> was the same thing as a combining sequence.
>
> However, the two are meant to be different, so it can't be the same.

They are meant to have exactly the same meaning, appearance and behavior.
The difference is only inside the computer's memory, and should be invisible
to users.
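
To make the "inside the computer's memory" part concrete, here is a small
Python sketch. The letter e with acute is just my choice of example, and the
helper name code_points is made up for this illustration:

```python
# Two ways to store the same user-visible letter "e with acute".
composite = "\u00e9"   # one code point: LATIN SMALL LETTER E WITH ACUTE
sequence = "e\u0301"   # two code points: e + COMBINING ACUTE ACCENT


def code_points(s):
    """Return the code points of s in U+XXXX notation (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in s]


print(code_points(composite))            # ['U+00E9']
print(code_points(sequence))             # ['U+0065', 'U+0301']

# A naive comparison sees two different strings...
print(composite == sequence)             # False

# ...and so does the byte-level (here UTF-8) representation.
print(composite.encode("utf-8").hex())   # c3a9
print(sequence.encode("utf-8").hex())    # 65cc81
```

Both strings should display as the same accented letter; only the stored
code points differ.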

The purpose of the normalization algorithm described in the report you cite
(TR15) is to get rid of this useless difference:

- Normalization Form D (NFD) turns any precomposed accented letter into a
letter + accent sequence.

- Normalization Form C (NFC) turns any letter + accent sequence into a
precomposed accented letter, if one exists.
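
The two forms can be tried out with Python's standard unicodedata module.
This is only a sketch: the sample strings are my own, and q + acute is picked
as an example of a sequence for which, as far as I know, no precomposed
letter exists:

```python
import unicodedata

composite = "\u00e9"   # precomposed e with acute
sequence = "e\u0301"   # e + COMBINING ACUTE ACCENT

# NFD: precomposed letter -> letter + accent sequence
nfd = unicodedata.normalize("NFD", composite)
print(nfd == sequence)    # True

# NFC: letter + accent sequence -> precomposed letter
nfc = unicodedata.normalize("NFC", sequence)
print(nfc == composite)   # True

# "If one exists": q + acute has no precomposed form, so NFC leaves
# the two code points as they are.
q_acute = "q\u0301"
print(len(unicodedata.normalize("NFC", q_acute)))   # 2

# After normalizing both sides to the same form, the "useless
# difference" no longer affects comparison.
same = unicodedata.normalize("NFC", composite) == unicodedata.normalize("NFC", sequence)
print(same)               # True
```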

BTW, I have always been told that precomposed accented letters exist in
Unicode only for backward compatibility with existing standards. If this
compatibility issue didn't exist, Unicode text would look like NFD.

> If I am getting the Unicode terminology correct, a combining
> sequence is like a plain ASCII letter A, with the accent
> following.

Yes. "Following" in the encoding order, but the corresponding accent glyph
should be drawn on top of the letter glyph (or below it, etc.).
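
A quick way to see the encoding order is to list the code points of such a
sequence. In this sketch the example string and the describe helper are my
own; the third value printed is the canonical combining class from the
Unicode Character Database (0 for a base letter, 230 for marks placed above,
220 for marks placed below):

```python
import unicodedata

# Letter A followed, in encoding order, by its accent.
text = "A\u0301"


def describe(ch):
    """Return (code point, character name, canonical combining class)."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))


for ch in text:
    print(describe(ch))
# ('U+0041', 'LATIN CAPITAL LETTER A', 0)
# ('U+0301', 'COMBINING ACUTE ACCENT', 230)

# An accent drawn below the letter still follows it in the encoding.
print(describe("\u0323"))
# ('U+0323', 'COMBINING DOT BELOW', 220)
```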

_ Marco


