I think you are overstating the situation. I would say that there is a
growing consensus in the direction you describe. There is nothing on
the roadmap beyond 21 bits, and everything beyond 21 bits has been removed,
but, unless I have missed a resolution, only UTF-16 and UTF-32 are so
limited. I admit this distinction is probably strictly academic (so maybe
I shouldn't even be pointing it out).
In any case, if anyone has plans for these 11 bits, I am interested to hear
what those plans might be. I would also like to caution folks that these bits
must be 0 when you send the data to another UTF-32 compliant task, and that
whenever you write data to disk, another UTF-32 compliant task may well
be looking at it (and certainly don't send those dirty 11 bits out
on the web).
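To make the caution concrete, here is a minimal sketch in Python of what I
mean, assuming an implementation that keeps a private tag in the top 11 bits
of each 32-bit word. The names pack, strip_private_bits and encode_utf32_be
are hypothetical, made up for illustration; the only firm facts are the
21-bit width and the 0x10FFFF ceiling.

```python
import struct

CODEPOINT_BITS = 21                        # width discussed in this thread
CODEPOINT_MASK = (1 << CODEPOINT_BITS) - 1 # 0x1FFFFF, the low 21 bits
MAX_CODEPOINT = 0x10FFFF                   # highest code point reachable via UTF-16
TAG_LIMIT = 1 << (32 - CODEPOINT_BITS)     # 11 spare bits -> 2048 tag values


def pack(codepoint, tag):
    """Hypothetical internal form: private tag in the top 11 bits."""
    if not 0 <= codepoint <= MAX_CODEPOINT:
        raise ValueError("code point out of range")
    if not 0 <= tag < TAG_LIMIT:
        raise ValueError("tag does not fit in 11 bits")
    return (tag << CODEPOINT_BITS) | codepoint


def strip_private_bits(words):
    """Zero the top 11 bits of every word, leaving plain code points."""
    return [w & CODEPOINT_MASK for w in words]


def encode_utf32_be(words):
    """Serialize as big-endian UTF-32, cleaning the words first."""
    clean = strip_private_bits(words)
    return struct.pack(">%dI" % len(clean), *clean)
```

The point of routing every write through something like encode_utf32_be is
that the private bits never reach disk or the wire, so it does not matter
what the internal representation does with them.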
On Wed, 5 Apr 2000, Juliusz Chroboczek wrote:
> "Christopher John Fynn" <firstname.lastname@example.org>:
> CF> Despite all the above I agree that 32 bits makes sense - even if
> CF> you never actually need the extra encoding slots the extra bits
> CF> provide.
> As people have already noted, the limitation to 21 bits is useful for
> implementations. I read the 21 bit limitation as an official
> statement saying ``feel free to use the top 11 bits of every word for any
> purpose you see fit, and rest assured that your choice of data
> representation will not need to be revised as we add new codepoints''.
> I am grateful for this reassurance.
> I am sorry to bring such down-to-earth considerations as
> implementation and efficiency issues to this list. To quote one of my former
> lecturers, while a beautiful theoretical construction doesn't need
> justification, having applications can never harm.