--- Asmus Freytag <firstname.lastname@example.org> wrote:
> There are 66 non-characters as of Unicode 3.1; there were 34 non-characters
> There are no "hidden" non-characters, but there were "hidden" planes in
> Unicode 3.0 - hidden in the limited sense that they were defined as character and
> locations, but no characters were assigned, other than the private use
I understand now: the non-characters in the 16 higher
planes were defined first, then the ones in the Arabic
Presentation Forms block. In that case it is as I
suspected, just a documentation problem. The book says
"None of these surrogate pairs has been ASSIGNED in
this version of the standard" (emphasis mine). It
would merely be misleading to leave 32 non-characters
out of the section called "non characters" and to
state that there are no characters in the higher
planes as of Unicode 3.0. But I think it is a bona
fide incorrect statement to say that no surrogate pair
has been ASSIGNED when in fact 32 surrogate pairs were
assigned the status of non-characters. My "hidden"
nickname stands for these 32 surrogate pairs; sorry,
Asmus. The 32 non-characters in the Arabic
Presentation Forms block I'll call the "arabic"
non-characters.
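To make the arithmetic concrete, here is a minimal sketch that lists both
groups. It assumes the Unicode 3.1 definitions: the range U+FDD0..U+FDEF in
the Arabic Presentation Forms-A block, plus the last two code points of each
of the 17 planes. The function and variable names are made up for
illustration.

```python
def noncharacters():
    """Return (arabic, plane_ends): the two groups of non-characters."""
    # 32 contiguous code points, U+FDD0 through U+FDEF.
    arabic = list(range(0xFDD0, 0xFDF0))
    # U+nFFFE and U+nFFFF for every plane n from 0 (the BMP) to 16.
    plane_ends = [(plane << 16) | low
                  for plane in range(17)
                  for low in (0xFFFE, 0xFFFF)]
    return arabic, plane_ends


arabic, plane_ends = noncharacters()
# The "hidden" ones are those above the BMP, reachable in UTF-16
# only through surrogate pairs.
hidden = [cp for cp in plane_ends if cp > 0xFFFF]

print(len(arabic), len(plane_ends), len(hidden), len(arabic) + len(plane_ends))
# prints: 32 34 32 66
```

So the 34 split into 2 in the BMP (U+FFFE and U+FFFF) and 32 in the higher
planes, and the 32 added in the BMP bring the total to 66.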
> The reason to put the additional (defined in 3.1) non-characters into the
> BMP is to allow them to have single codes for UTF-16 implementation -
> something that doesn't work so well if they are on the higher planes.
I don't understand this: are the "arabic" non-characters
supposed to REPRESENT the "hidden" non-characters?
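One possible reading of the quoted point is that it concerns how many 16-bit
code units each non-character occupies in UTF-16, rather than one set
standing in for the other. The sketch below applies the standard UTF-16
surrogate arithmetic; the helper name utf16_units is hypothetical, and it
does not validate its input (for example, it does not reject code points in
the surrogate range).

```python
def utf16_units(cp):
    """Return the UTF-16 code units for one code point as a list of ints."""
    if cp <= 0xFFFF:
        # BMP code points fit in a single 16-bit code unit.
        return [cp]
    # Higher planes need a surrogate pair: subtract 0x10000, then split
    # the remaining 20 bits into a high and a low 10-bit half.
    offset = cp - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


for cp in (0xFDD0, 0xFFFF, 0x1FFFE, 0x10FFFF):
    units = " ".join(f"{u:04X}" for u in utf16_units(cp))
    print(f"U+{cp:04X} -> {units}")
# prints:
# U+FDD0 -> FDD0
# U+FFFF -> FFFF
# U+1FFFE -> D83F DFFE
# U+10FFFF -> DBFF DFFF
```

A non-character in the BMP is a single code unit, so a UTF-16 implementation
can store or compare it as one 16-bit value. One in a higher plane always
takes two.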
This archive was generated by hypermail 2.1.2 : Tue Oct 02 2001 - 00:25:46 EDT