Re: Displaying Plane 1 characters

From: John Cowan (cowan@locke.ccil.org)
Date: Thu Nov 12 1998 - 16:45:46 EST


Keld scripsit:

> You should not use \uxxxx notation for surrogates,
> as surrogates are not characters in either Unicode or 10646,
> and thus the short identifiers cannot be used.

In Java, the sequence '\uxxxx' where xxxx is precisely 4
hex digits represents a datum of the Java type "char",
a numeric value ranging from 0 to 65535. Java as such does not
understand surrogates, though Java applications may.
Therefore, "\ud800\ude08" is a Java String containing two chars.
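
A minimal sketch of that point, for illustration only. The class and
method names (SurrogateDemo, sample) are made up here, and the
codePointAt/codePointCount calls assume Java 5 or later, which postdates
this message; length() and charAt() are all that is needed to see the
two chars.

```java
class SurrogateDemo {
    // Hypothetical helper: a String literal written as two 4-hex-digit
    // char escapes, a high surrogate followed by a low surrogate.
    static String sample() {
        return "\ud800\ude08";
    }

    public static void main(String[] args) {
        String s = sample();

        // The language sees two values of type char, nothing more.
        System.out.println(s.length());                       // 2
        System.out.println(Integer.toHexString(s.charAt(0))); // d800
        System.out.println(Integer.toHexString(s.charAt(1))); // de08

        // Library code (Java 5+, an assumption here) may choose to
        // interpret the pair as one supplementary code point.
        System.out.println(s.codePointCount(0, s.length()));      // 1
        System.out.println(Integer.toHexString(s.codePointAt(0))); // 10208
    }
}
```

The first three lines of output reflect what the language itself
guarantees; the last two show an application-level (here, library-level)
reading of the same two chars as a surrogate pair.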

Java chars (that is, Unicode characters) are not the same as
Unicode abstract characters (that is, 10646 characters).

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)


