Re: Perception that Unicode is 16-bit (was: Re: Surrogate space in Unicode)

From: Kenneth Whistler
Date: Tue Feb 20 2001 - 15:10:29 EST

Tobias Hunger said:

> Looks like David was quoting me. I am working on Babylon and wanted to make
> clear that it is not unicode conformant as its API uses 32bit wide characters
> which violates clause 1 of Section 3.1.

No longer, as Peter pointed out.

> Babylon can im-/export UTF-8/16/32
> (UTF-7 is in the works) though, so I'm aiming for 'unicode compliant
> interchange of 16bit Unicode characters' with Babylon. For more details
> please see pages 107/108 of the Standard.

Also out of date. That text was likewise subjected to a major revision at the
just-completed UTC meeting.

These actions were taken to make it clear to everyone that use of a 32-bit
encoding form is *not* inconsistent with a claim of compliance to the Unicode
Standard, now that UTF-32 has been officially added as a sanctioned encoding
form. From this date forward, no one should have to jump through hoops to
explain how their 32-bit wide character implementations are and are not
conformant to the Unicode Standard.
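
As an illustration only, here is a minimal sketch in Python 3 using only the
standard codecs. The helper names (surrogate_pair, encoding_forms) are
hypothetical and are not taken from Babylon or from any UTC text. It shows that
a character held as a single 32-bit value and its UTF-8, UTF-16, and UTF-32
forms all represent the same code point, so an API built on 32-bit wide
characters loses nothing relative to the 16-bit form:

```python
def surrogate_pair(cp):
    """Split a supplementary code point (U+10000..U+10FFFF) into its
    UTF-16 high and low surrogate code units."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000                  # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def encoding_forms(cp):
    """Return the big-endian byte sequences for one code point in the
    three encoding forms (hypothetical helper, for illustration)."""
    ch = chr(cp)
    return {
        "utf-8": ch.encode("utf-8"),
        "utf-16": ch.encode("utf-16-be"),
        "utf-32": ch.encode("utf-32-be"),
    }


if __name__ == "__main__":
    cp = 0x1D11E  # a code point outside the 16-bit range
    for name, data in encoding_forms(cp).items():
        # Every form decodes back to the same single code point.
        assert ord(data.decode(name + "-be" if name != "utf-8" else name)) == cp
        print(name, data.hex())
    print("surrogates:", [hex(u) for u in surrogate_pair(cp)])
```

In the 32-bit form the code unit is simply the code point value itself; only
the 16-bit form needs the surrogate arithmetic.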

Antoine Leca said:

> wrote:
> >
> > Eh? Unicode has no aversion to either a 32-bit encoding form (UTF-32 - see
> > UTR#19 or PDUTR#27) or with C++.
> Read also TUS3.0, par. 5.2 on top of page 108...
> As far as I know, neither UAX-29 nor PDUTR-27 has changed these words...
> That said, one can see it as an oversight that ought to be corrected.
> As the guy that fought to introduce the widest use of ISO10646/Unicode
> in C99, I will certainly welcome any change in this area! ;-)

All taken care of in the rewrite of section 5.2, based on the last
UTC meeting's review of the text of PDUTR #27.

In general, folks, please calm down a little. The text of PDUTR #27 is
out-of-date -- it was a *Proposed Draft*, after all, for review by
the UTC. And the editorial committee has been working furiously to update
the text for final posting. We decided not to publicly post a bunch of
intermediate drafts every 3 days during this process, to avoid generating
more confusion about the text drift. But the scheduled date for the
next public draft of what will become UAX #27 in the final Unicode 3.1
release is this Friday, February 23.

I cannot promise that all issues will be resolved and all truth will
be revealed in that document, but much of what has been discussed on
this thread should become moot.

