Re: CJK(B) and IE6

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sat May 01 2004 - 17:11:48 CST


Note that I can successfully display extended Chinese with the registry setting
applied to support surrogates, whether the page is coded with GB18030 or with
Unicode. My question was about the native support of GB18030 in Windows 2000/XP,
China Editions: does Windows come preset with this registry setting, or is the
registry setting also absent by default, with Windows still able to display
extended Chinese provided that the page is encoded with GB18030 and not a
Unicode UTF?
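
To make the comparison concrete, here is a minimal Python sketch (not something
from the original discussion) showing that the same extended ideograph is a
four-byte sequence both in GB18030 and in UTF-8. U+20000 is an arbitrary CJK
Extension B code point chosen for illustration, and the `hexdump` helper is a
hypothetical name.

```python
# Illustrative only: U+20000 is an arbitrary CJK Extension B code point.
ch = "\U00020000"

gb = ch.encode("gb18030")   # four-byte GB18030 sequence
utf8 = ch.encode("utf-8")   # four-byte UTF-8 sequence

def hexdump(data):
    """Format bytes as space-separated hex pairs (hypothetical helper)."""
    return " ".join(f"{b:02x}" for b in data)

print("GB18030:", hexdump(gb))
print("UTF-8:  ", hexdump(utf8))

# Both byte sequences decode back to the same single code point.
assert gb.decode("gb18030") == utf8.decode("utf-8") == ch
```

Both encodings carry the same code point, so any difference in what the browser
displays would come from how it handles each charset, not from the data itself.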

It's just a shame that IE treats NCRs and a standard UTF-8 encoding differently.
Using NCRs is really an ugly nightmare. I tried to encode a page with CESU-8 or
with UTF-16 surrogates, with no more success on my French edition of Windows,
unless the registry setting is applied.
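
For readers less familiar with these forms, the following Python sketch shows
the representations mentioned above for one supplementary character: standard
UTF-8, the UTF-16 surrogate pair, CESU-8, and a hexadecimal NCR. The function
names are hypothetical and U+20000 is only an example code point; this is an
illustration of the encodings, not of what IE does internally.

```python
def utf16_surrogates(cp):
    """Return the (high, low) surrogate pair for a code point above U+FFFF."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    v = cp - 0x10000
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)

def cesu8(cp):
    """CESU-8: each UTF-16 surrogate is encoded on its own as 3 bytes."""
    return b"".join(
        chr(s).encode("utf-8", "surrogatepass") for s in utf16_surrogates(cp)
    )

def ncr(cp):
    """Hexadecimal numeric character reference, as written in HTML source."""
    return f"&#x{cp:X};"

cp = 0x20000  # example code point in CJK Extension B
print("UTF-8 :", chr(cp).encode("utf-8").hex())           # 4 bytes
print("UTF-16:", [hex(s) for s in utf16_surrogates(cp)])  # surrogate pair
print("CESU-8:", cesu8(cp).hex())                         # 6 bytes
print("NCR   :", ncr(cp))
```

Note that CESU-8 is not valid UTF-8: a conforming UTF-8 decoder rejects encoded
surrogates, which is why Python needs the `surrogatepass` error handler here.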

Also, the registry setting is very ugly: you can define only one font for all
scripts requiring use of surrogates. If you map the SimSun18030 font there, then
you can't display other scripts added in plane 1. I think that Microsoft does
not document this registry setting much, just because it was a temporary
solution, probably made in an emergency just to support the GB18030 charset,
whose support was required in 2000 to be able to sell Windows 2000 in P.R.
China. Shamefully, this "solution" was still maintained 2 years later when
Windows XP was about to be released. And likewise in Windows 2003, Office
XP, ...

I really hope that better support will soon be added to Internet Explorer:
more than 4 years have passed since this registry hack was implemented, and it's
high time that a better solution be deployed and made available to users by
Microsoft.

Is this situation blocked at Microsoft, with no real voluntary commitment to put
the development of support for characters encoded with surrogates into
production? This lack of support in IE is blocking other developments by other
vendors and font foundries, because they will legitimately conclude that,
without suitable support in a browser that represents today more than 90% of
the browser market (including alternate browsers based on the IE engine), there
is little incentive to invest in the area of characters outside the BMP.

Now that support for the Supplementary Ideographic Plane is wanted by users
of CJK languages, how can Microsoft continue to ignore billions of them? Also,
there are now many scripts in the SMP that researchers and students around the
world would like to use and deploy more easily. Correcting the UTF-8 support for
characters coded on more than 3 bytes in IE should not be considered a low
priority, as it causes nightmares for web designers.
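
As a rough illustration of which characters are affected, the Python sketch
below scans a string for code points above U+FFFF, which are exactly the ones
that need 4 bytes in UTF-8, and reports their plane (1 for the SMP, 2 for the
SIP). The function name and the sample string are assumptions made for this
example.

```python
def supplementary_chars(text):
    """Yield (index, code point, plane) for each character outside the BMP."""
    for i, ch in enumerate(text):
        cp = ord(ch)
        if cp > 0xFFFF:
            yield i, cp, cp >> 16  # the plane number is the code point's top bits

# Sample: Latin A (BMP), a Gothic letter (SMP), a CJK Extension B ideograph (SIP).
sample = "A\U00010330\U00020000"
found = list(supplementary_chars(sample))

for index, cp, plane in found:
    size = len(chr(cp).encode("utf-8"))
    print(f"index {index}: U+{cp:04X}, plane {plane}, {size} UTF-8 bytes")
```

A page author could run such a check to see whether a document contains any
characters that depend on the browser's handling of 4-byte UTF-8 sequences.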

There's certainly a much larger community of users who would like to adapt
their products to characters outside the BMP, but who will not do so as long as
supplementary planes are not supported by IE, which Microsoft has long promoted
as an API of Windows, used in many internal services as well as a key element
for building local user interfaces outside the web, even if the browser UI
itself can be disabled and freely replaced by another browser (such as
Gecko-based browsers, including Mozilla). Because of this lack of correct
support in IE, many tools will be developed without correct support for
UTF-32, surrogates and characters outside the BMP.

Sun has done the job in Java 1.5. Will Microsoft finalize some workable solution
for this issue?


