Re: An Absurdly Brief Introduction to Unicode (was Re: Perception ...)

From: John Cowan (jcowan@reutershealth.com)
Date: Fri Feb 23 2001 - 11:21:27 EST


Mark Davis wrote:

>> A _code_point_ is an integer value which is assigned to an abstract
>> character. Each character receives a unique code point.
>
>
> inaccurate. Multiple *abstract characters* can have a single code point;
> multiple code points can correspond to a single *abstract character*.

TUS 3.0 is vague on this, but I suppose what is meant is that if two
single characters are canonically equivalent, they constitute only one
abstract character. Does U+0041 U+0300 represent one abstract
character (the same as the abstract character represented by U+00C0)
or two consecutive abstract characters? If the former, does U+0051
U+0300 (Q with a grave accent, which has no precomposed form) also
represent a single abstract character?
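
For anyone who wants to poke at the sequences in question, here is a
minimal sketch using Python's standard unicodedata module. The helper
names (code_points, nfc, nfd) are mine, not anything from the standard,
and the module tracks a later Unicode version than TUS 3.0. It only
shows which sequences are canonically equivalent; it does not settle
how many abstract characters any of them represents.

```python
import unicodedata

def code_points(s):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

def nfc(s):
    """Canonical composition (Normalization Form C)."""
    return unicodedata.normalize("NFC", s)

def nfd(s):
    """Canonical decomposition (Normalization Form D)."""
    return unicodedata.normalize("NFD", s)

a_grave_sequence = "\u0041\u0300"   # A followed by COMBINING GRAVE ACCENT
a_grave_single = "\u00C0"           # LATIN CAPITAL LETTER A WITH GRAVE
q_grave_sequence = "\u0051\u0300"   # Q followed by COMBINING GRAVE ACCENT
angstrom_sign = "\u212B"            # ANGSTROM SIGN
a_ring = "\u00C5"                   # LATIN CAPITAL LETTER A WITH RING ABOVE

# The two-code-point sequence composes to the single precomposed code point.
print(code_points(nfc(a_grave_sequence)))   # U+00C0
print(code_points(nfd(a_grave_single)))     # U+0041 U+0300

# Q with grave has no precomposed form, so it stays two code points.
print(code_points(nfc(q_grave_sequence)))   # U+0051 U+0300

# Two single code points that are canonically equivalent to each other.
print(code_points(nfc(angstrom_sign)))      # U+00C5
print(nfd(angstrom_sign) == nfd(a_ring))    # True
```

Two strings are canonically equivalent exactly when their NFD forms are
identical, which is what the last comparison checks.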

-- 
There is / one art             || John Cowan <jcowan@reutershealth.com>
no more / no less              || http://www.reutershealth.com
to do / all things             || http://www.ccil.org/~cowan
with art- / lessness           \\ -- Piet Hein
