Re: explicit 20 bit Unicode range limit (was: UTF-20 etc.)

From: Rick McGowan
Date: Tue Jan 26 1999 - 14:20:57 EST wrote...

> All I am trying to do is to think about what happens in Unicode
> implementations once characters with code points/scalar values above U+ffff
> are actually used, which I expect to happen more and more relatively soon.

Yes, that's fine to think about. Of course, nothing is encoded there yet,
so we have plenty of time to think about this...

> Costs are in adding or changing a couple of paragraphs in forthcoming
> editions of the Unicode Standard and of ISO-10646.

I beg to differ. This stuff is always more costly than you might think. If
it were a mere few paragraphs it might be rather trivial. I don't see any
big advantage to defining any new formats, and in fact, it creates more
problems than it solves. For instance:

. More paperwork, like shepherding the proposal through UTC and WG2
. More paperwork, like shepherding another RFC or two
. More paperwork, like documenting the differences, similarities, and
relationships between the new UTF and the existing UTFs
. More paperwork, like a Tech Report and sample implementation(s)
. More confusion, which can be cleared up only by expenditure of more effort
in evangelizing, explaining, and documenting

Not to mention the Bad Press and Confusion generated by having yet another
format for what is just the same thing.

What actual problems does the proposal solve that outweigh all the required
paperwork? What's wrong with using a 32-bit container if you want to
normalize UTF-16 to avoid surrogate encoding?
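
To make the 32-bit container idea concrete, here is a minimal sketch in Python of
expanding UTF-16 code units into one scalar value per character, so that no
surrogates remain in the normalized form. The function name utf16_to_scalars is
hypothetical and is not taken from any library or from this thread; the
arithmetic is the standard UTF-16 surrogate-pair mapping.

```python
def utf16_to_scalars(units):
    """Expand 16-bit UTF-16 code units into a list of scalar values.

    Each result fits easily in a 32-bit container. The name and the
    error handling are illustrative assumptions, not an existing API.
    """
    scalars = []
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # High surrogate: it must be followed by a low surrogate.
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                low = units[i + 1]
                scalars.append(
                    0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
                )
                i += 2
                continue
            raise ValueError("unpaired high surrogate at index %d" % i)
        if 0xDC00 <= unit <= 0xDFFF:
            # A low surrogate with no preceding high surrogate is ill-formed.
            raise ValueError("unpaired low surrogate at index %d" % i)
        # Any other code unit is already a scalar value.
        scalars.append(unit)
        i += 1
    return scalars
```

For instance, utf16_to_scalars([0x0041, 0xD800, 0xDC00]) yields [0x41, 0x10000]:
the surrogate pair collapses into a single value above U+FFFF, and the rest of
the text passes through unchanged.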

