Dan Oscarsson wrote:
> So the Unicode hackers have finally succeeded to destroy ISO 10646.
> Just because Unicode made the bad decision to limit their character
> coding to 16 bits does not mean that we have to limit the available
> space for characters to less than 31 bits.
Are you suggesting that one day ISO 10646 is going to need more than a million
characters? All the major living scripts seem to have more or less already been
encoded in the standard, bar a few odd characters. Amongst the so far unencoded
living scripts and historic scripts there seem to be only a few scripts (e.g.
Egyptian Hieroglyphs) which have a large character repertoire. It is hard to
imagine any characters beyond even plane 3 ever being required (except for
private use and things like the plane 14 language tags - and there is more than
enough space for those).
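To put rough numbers on the above, here is a small sketch of the code-space arithmetic. The constant and function names are my own, purely illustrative, and it assumes the usual layout of 65,536 code points per plane, with planes 0 to 16 on one side and the full 31-bit space on the other.

```python
# Illustrative arithmetic only; these names are not from any standard API.
PLANE_SIZE = 1 << 16      # 65,536 code points in each plane
PLANES_0_TO_16 = 17       # planes 0 through 16 inclusive

limited_space = PLANES_0_TO_16 * PLANE_SIZE   # just over a million code points
full_space = 1 << 31                          # the 31-bit code space


def plane_of(code_point: int) -> int:
    """Return the number of the plane that contains the given code point."""
    return code_point >> 16


print(limited_space)      # size of planes 0-16
print(full_space)         # size of the 31-bit space
print(plane_of(0xE0001))  # plane of the language tag character U+E0001
```

Even the restricted space holds 1,114,112 code points, which is where the "more than a million" figure comes from.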
All proposals for additional characters in the ISO 10646 and Unicode standards
need to demonstrate those characters are unique, and each proposal has to be
discussed, balloted etc. in the relevant committees before they ever become part
of these standards. This process takes a lot of time and becomes more difficult
and protracted as the scripts being encoded become less well known. There is
also little commercial incentive to get such scripts encoded quickly, few people
to champion their cause, and there is some real opposition to even considering
many of them. Given all this, the process of filling planes 1 and 2 is going to
take many, many years and most of us will probably be dead before we even get to
plane 3. Where is there ever going to be a need for characters beyond plane 16 -
unless someone wants to propose that DNA sequences be encoded as Unicode
characters or we make contact with several extraterrestrial species with their
Despite all the above, I agree that 32 bits makes sense - even if you never
actually need the extra encoding slots the extra bits provide. Since it is the
UTF-16 kludge that has made this limitation necessary (surrogate pairs cannot
reach beyond plane 16), perhaps when we reach the day that UTF-16 is no longer
being used this limitation can be removed. I'm quite sure that day will come
long before we need to encode characters on plane 17 and beyond.
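For anyone wondering where the plane 16 ceiling comes from, it matches the largest value a UTF-16 surrogate pair can carry. A minimal sketch, assuming the standard surrogate ranges (high D800-DBFF, low DC00-DFFF); the function names are hypothetical and only for illustration:

```python
def to_surrogates(code_point: int) -> tuple[int, int]:
    """Split a supplementary-plane code point into a (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points in planes 1-16 use surrogate pairs")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def from_surrogates(high: int, low: int) -> int:
    """Combine a (high, low) surrogate pair back into a code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# The largest possible pair gives the last code point of plane 16.
largest = from_surrogates(0xDBFF, 0xDFFF)
print(hex(largest), largest >> 16)
```

Two 10-bit halves give 2^20 supplementary code points, i.e. exactly planes 1 to 16 on top of plane 0, so nothing on plane 17 or above is reachable this way.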