RE: Astral planes (was: RE: Plane One use, was Re: HTML Validation)

From: Rick Cameron (Rick.Cameron@crystaldecisions.com)
Date: Tue Dec 18 2001 - 18:38:07 EST


Are you planning to add an explicit statement to the Unicode standard that
the valid range for scalar values is 0..10FFFF? (Or is such a statement
there, and I've just missed it?)

In the absence of such a statement, I think it's very easy for people to get
the idea that the range of scalar values is unbounded above, and that any
limit is a property of a particular encoding.

In particular, as the use of 32-bit variables to hold Unicode characters
becomes more common (apparently most unices make wchar_t 32 bits wide), many
will imagine that such a variable represents a 32-bit encoding of Unicode,
with range 0..FFFFFFFF, where it just happens that every value above 10FFFF
is unassigned.

I was one such person (but no longer!)
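
To make the distinction concrete, here is a minimal sketch in C of the check a program would need if it keeps characters in 32-bit variables. The names UNICODE_MAX, in_unicode_range and is_scalar_value are hypothetical, chosen only for illustration. The surrogate exclusion in the second function is an assumption on my part and is not something stated in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Upper bound of the range discussed in this message: 0..10FFFF. */
#define UNICODE_MAX 0x10FFFFu

/* True if v lies in 0..10FFFF. A 32-bit variable can hold values up to
 * FFFFFFFF, so the bound must be checked explicitly. */
static bool in_unicode_range(uint32_t v)
{
    return v <= UNICODE_MAX;
}

/* Assumption (not from this thread): also reject D800..DFFF, the values
 * reserved for surrogates, which do not stand for characters themselves. */
static bool is_scalar_value(uint32_t v)
{
    return in_unicode_range(v) && !(v >= 0xD800u && v <= 0xDFFFu);
}
```

With this sketch, in_unicode_range(0x10FFFF) holds while in_unicode_range(0x110000) does not, even though both values fit comfortably in a 32-bit wchar_t.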

Of course, the Unicode Standard 3.0 doesn't even mention a 32-bit encoding -
but that's not stopping uniphiles from storing Unicode data in their
wchar_t's!

Thanks

- rick cameron

-----Original Message-----
From: Asmus Freytag [mailto:asmusf@ix.netcom.com]
Sent: Tuesday, 18 December 2001 13:53
To: Rick Cameron; unicode@unicode.org
Subject: RE: Astral planes (was: RE: Plane One use, was Re: HTML Validation)

At 10:38 AM 12/18/01 -0800, Rick Cameron wrote:
>It looks like UCS-2 and UCS-4 are defined in ISO 10646. Does that
>standard restrict the valid range of UCS-4 to 0..10FFFF?

It will, with AMD1 to ISO/IEC 10646-1:2000, which is expected to pass final
balloting and head for publication in 2002.

>If not, does this represent
>a significant divergence between Unicode and ISO 10646?

No. Just one more area where the committees are working together to make
sure that the formal statements of both standards are completely
synchronized, despite starting from a different framework and approach to
standardization and somewhat different terminology as well. Getting the
last few wrinkles out may take another amendment or so, but the will is there
to see it through.

A./

Technical Vice President
The Unicode Consortium
Liaison to ISO/IEC JTC1/SC2/WG2


