Re: An Absurdly Brief Introduction to Unicode (was Re: Perception ...)

From: John Cowan
Date: Fri Feb 23 2001 - 11:21:27 EST

Mark Davis wrote:

>> A _code_point_ is an integer value which is assigned to an abstract
>> character. Each character receives a unique code point.
> inaccurate. Multiple *abstract characters* can have a single code point;
> multiple code points can correspond to a single *abstract character*.

TUS 3.0 is vague on this, but I suppose what is meant is that if two
single characters are canonically equivalent, they constitute only one
abstract character. Does U+0041 U+0300 represent one abstract
character (the same as the abstract character represented by U+00C0)
or two consecutive abstract characters? If the former, does U+0051
U+0300 also represent an abstract character?
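For concreteness, here is a minimal sketch of the two cases using Python's
standard unicodedata module. It assumes a much later Unicode version than
TUS 3.0, though these particular mappings have not changed. The helper name
nfc_code_points is made up for illustration. Note that this only shows what
canonical composition does with each sequence; it cannot settle how many
abstract characters either sequence represents.

```python
import unicodedata

def nfc_code_points(s):
    """Return the code points of s after canonical composition (NFC).

    Hypothetical helper for illustration only.
    """
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", s)]

# U+0041 U+0300 is canonically equivalent to the precomposed U+00C0,
# so NFC collapses the pair to a single code point.
a_grave = nfc_code_points("\u0041\u0300")

# U+0051 U+0300 has no precomposed counterpart, so NFC leaves it as
# two code points.
q_grave = nfc_code_points("\u0051\u0300")

print(a_grave)  # ['U+00C0']
print(q_grave)  # ['U+0051', 'U+0300']
```

Both sequences are a base letter followed by the same combining mark; the
only difference the normalizer sees is whether a precomposed code point
happens to exist.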

