Addison P. Phillips wrote:
> currently there are no characters "up there" this isn't a really big
> deal. Shortly, when Unicode 3.1 is official, there will be 40K or so
> characters in the supplemental planes... but they'll be
> relatively rare.
This reminds me of a question that I have wanted to ask for a long time: how
rare is the most common of the characters in the supplementary planes? Hmmmm... Maybe
I should be clearer.
Is there at least one character above U+FFFF that is commonly used in at
least one modern language?
I am wondering especially about the CJK characters in Extension B. We all
know that the majority of them are rare, ancient or idiosyncratic
characters, but I am not quite sure that this is true for *all* of them.
I think that this is an important question when deciding whether an
application should use 32-bit or 16-bit characters internally, and whether an
application has to be fully UTF-16 aware (i.e. handle surrogate pairs) or can
be "UTF-16 ignorant".
E.g., imagine designing an application that will be localized for Cantonese:
it is important to know whether all characters needed for Cantonese are in
the BMP, or whether some of them are in Extension B.
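To make the distinction concrete, here is a minimal sketch in Python. The helper names utf16_code_units and non_bmp_chars are made up for this example, and U+20000 is used only because it is the first code point of the Extension B range, not because it is known to be a common character. It shows that a character above U+FFFF takes two 16-bit code units (a surrogate pair) in UTF-16, and how one might scan sample text for such characters:

```python
def utf16_code_units(ch):
    """Return the UTF-16 code units (as integers) that encode the string ch."""
    data = ch.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def non_bmp_chars(text):
    """Return the characters of text that lie above U+FFFF (outside the BMP)."""
    return [c for c in text if ord(c) > 0xFFFF]


# Hypothetical sample: U+4E2D (in the BMP) followed by U+20000 (Extension B).
sample = "\u4e2d\U00020000"

for c in sample:
    units = " ".join(f"{u:04X}" for u in utf16_code_units(c))
    print(f"U+{ord(c):04X} -> {units}")

print(len(non_bmp_chars(sample)), "character(s) outside the BMP")
```

A "UTF-16 ignorant" application would treat the two code units of the second character as two separate 16-bit characters. Whether that matters in practice depends on the answer to the question above; running something like non_bmp_chars over a representative corpus for the target language would be one way to find out.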