Re: Linux & Unicode

Date: Thu Dec 03 1998 - 10:38:44 EST

I am disappointed by one thing:

Arnt Gulbrandsen wrote:

There is one big hole in the Unicode support of Qt (and consequently
that of the KDE): We don't plan to support that hacky extension trick
to access the next sixteen ISO10646 planes, at least not until there
is serious user demand for it. The "everything is a 16-bit character"
principle is too simple and uniform to let go, IMO.

In other words, right now, Qt supports UCS-2, not UTF-16.

As far as I can see, that "hacky extension trick" is there because using
32 bits everywhere for all characters seems unacceptable, and because a
full repertoire does not fit into straight 16 bits. True, it looks like it
will be at least a couple of years before characters beyond the BMP are
officially assigned, because there are more important scripts to go into
the BMP first. But there are proposals and suggestions for several tens of
thousands of characters to populate planes 1 and 2 (and 14), and there is
the private use area in planes 15 and 16.
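
For reference, the arithmetic behind the surrogate mechanism is small. The
following is a minimal Python sketch of it, not anything taken from Qt; the
function names are my own and purely illustrative. A code point in planes
1 to 16 is reduced to a 20-bit offset and split across two 16-bit units
taken from the reserved ranges D800..DBFF and DC00..DFFF.

```python
# Illustrative sketch of UTF-16 surrogate arithmetic.
# The function names are hypothetical, not part of any library.

def to_surrogates(cp):
    """Split a code point from planes 1..16 into (high, low) 16-bit units."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not in planes 1..16")
    offset = cp - 0x10000              # 20-bit value
    high = 0xD800 + (offset >> 10)     # upper 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # lower 10 bits
    return high, low


def from_surrogates(high, low):
    """Combine a (high, low) surrogate pair back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

A 16-bit-only implementation never needs either function, which is exactly
why it cannot reach anything outside the BMP.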

Does this mean that for now and the near future, with Linux/Qt/KDE,

- no one can use the non-BMP private use area
- new assignments, and fonts for them, cannot be prototyped
- the language tags in plane 14 that the Internet Mail Consortium proposes
to use (prematurely, I know, since they are not really standard yet) cannot
be used?


What is missing?
Is it the font technology not being capable of handling >1M character
positions and up to 100k chars?
Is it that those characters get treated as two unknowns?
Is it that conversions to/from UTF-8 fail?
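
To illustrate the last two questions, here is a hedged Python sketch of a
UTF-16 to UTF-8 converter with the surrogate pairing step made optional.
It is my own illustration, not Qt's code, and the names encode_utf8,
utf16_units_to_utf8 and pair_surrogates are hypothetical. With pairing
switched off, each half of a pair is converted on its own, as a UCS-2-only
converter would do, and the result is not valid UTF-8.

```python
# Illustrative sketch only; names are hypothetical, not a library API.

def encode_utf8(cp):
    """Encode one scalar value (or a lone 16-bit unit) as UTF-8 bytes."""
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


def utf16_units_to_utf8(units, pair_surrogates=True):
    """Convert a list of 16-bit units to UTF-8.

    With pair_surrogates=False every unit is treated as a character of its
    own, which mimics a converter that only knows UCS-2.
    """
    out = bytearray()
    i = 0
    while i < len(units):
        u = units[i]
        is_pair = (pair_surrogates
                   and 0xD800 <= u <= 0xDBFF
                   and i + 1 < len(units)
                   and 0xDC00 <= units[i + 1] <= 0xDFFF)
        if is_pair:
            # Combine high and low surrogate into one code point.
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        else:
            cp = u
            i += 1
        out += encode_utf8(cp)
    return bytes(out)
```

For the plane 15 code point F0000 (units DB80 DC00), the pairing version
yields the four bytes F3 B0 80 80, while the non-pairing version yields six
bytes, ED AE 80 ED B0 80, which a strict UTF-8 decoder rejects.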

Please, I would appreciate seeing the general plumbing for UTF-16,
including the part that sets it apart from UCS-2, in place sooner rather
than later. I am not asking anyone to provide fonts for characters that
aren't there yet, only to make it possible to provide them.


PS: Aren't there some people trying to use plane 15 code points for

Markus Scherer IBM RTP +1 919 486 1135 Dept. Fax +1 919 254 6430
