Re: surrogate terminology

From: Peter_Constable@sil.org
Date: Tue Sep 12 2000 - 21:38:45 EDT


On 09/12/2000 02:59:38 PM Kenneth Whistler wrote:

[snip]

I think Ken's comments on planes are good.

>3. The term "surrogate character" should be eschewed altogether, because
> of the confusion it causes. "Surrogate code point" can continue to
> be used as it currently is, and the term "surrogate pair" is also
> useful. But the other terminology related to characters...

The other terminology Ken discussed had to do with the plane in which a
character is found. What I think is still open is how the values d800 - dfff
should be referred to. Ken indicated that "surrogate code point" can continue
in use as is; I don't recall exactly how TUS 3.0 uses it. (That would have made
for a rather challenging trivia question :-) My biggest concern here is that
people should not be referring to U+d800 - U+dfff as characters. (I'd be
willing to accept code point, provided there is a clear statement as to
what is meant by a code point.) For that matter, I'd be inclined to say
that the U+ notation should not be used here - U+ should be reserved for
use to refer to encoded characters in terms of their Unicode scalar values.
So, 0xd800 is OK, but U+d800 would be wrong.
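
To make the distinction concrete, here is a rough sketch in Python. It is
illustration only, not text from TUS 3.0: the function names are hypothetical,
and the upper limit of 0x10FFFF assumes the range reachable through surrogate
pairs. It treats d800 - dfff as surrogate code points, treats everything else
in range as a Unicode scalar value, and applies the U+ notation only to the
latter.

```python
# Illustrative sketch only; all names here are hypothetical.
SURROGATE_FIRST = 0xD800
SURROGATE_LAST = 0xDFFF
MAX_CODE_POINT = 0x10FFFF  # assumed upper limit (range reachable via surrogate pairs)


def is_surrogate_code_point(value: int) -> bool:
    """True for values in d800 - dfff, which never stand for a character alone."""
    return SURROGATE_FIRST <= value <= SURROGATE_LAST


def is_scalar_value(value: int) -> bool:
    """True for in-range values that are not surrogate code points."""
    return 0 <= value <= MAX_CODE_POINT and not is_surrogate_code_point(value)


def combine_surrogate_pair(high: int, low: int) -> int:
    """Map a high/low surrogate pair to the scalar value it represents."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def format_value(value: int) -> str:
    """Use U+ notation only for scalar values; plain hex for anything else."""
    if is_scalar_value(value):
        return f"U+{value:04X}"
    return f"0x{value:04X}"
```

Under these assumptions, format_value(0x0041) gives "U+0041" while
format_value(0xd800) gives "0xD800", and combine_surrogate_pair(0xd800, 0xdc00)
gives 0x10000, the first scalar value beyond plane 0.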

- Peter

---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <peter_constable@sil.org>


