Re: CJK(B) and IE6

From: jameskass@att.net
Date: Sat May 01 2004 - 21:03:34 CST


The lack of support for supplementary characters expressed in UTF-8
in Internet Explorer is a bug. As Philippe Verdy mentions, the
Mozilla browser does not have this bug. It should also be
noted that the Opera browser handles non-BMP UTF-8 just fine.

While working with NCRs may be an ugly nightmare, there are some shortcuts.

The BabelPad editor can easily convert between UTF-8 and NCRs. Also,
even though Internet Explorer doesn't display the material, it doesn't
destroy the encoded text, either. The text can be copied from the browser
window and pasted into any application that supports supplementary
characters, and it retains its content.

The Internet Explorer browser itself can convert between UTF-8 and NCR
encoding forms with the "File - Save As" command.
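
For anyone who would rather script the UTF-8/NCR conversion than rely on
an editor or the browser, here is a minimal sketch in Python. The function
names to_ncr and from_ncr are made up for this illustration and are not
part of BabelPad or Internet Explorer; the sketch escapes every non-ASCII
character, which is one possible policy among several.

```python
import re

def to_ncr(text):
    """Replace every non-ASCII character with a hexadecimal NCR.

    Hypothetical helper for illustration; ASCII is passed through as-is.
    """
    return "".join(
        ch if ord(ch) < 0x80 else "&#x{:X};".format(ord(ch))
        for ch in text
    )

def from_ncr(text):
    """Expand decimal (&#131072;) and hexadecimal (&#x20000;) NCRs."""
    def expand(match):
        body = match.group(1)
        if body[0] in "xX":
            return chr(int(body[1:], 16))
        return chr(int(body))
    return re.sub(r"&#([xX][0-9A-Fa-f]+|[0-9]+);", expand, text)

# U+20000 is the first CJK Extension B character (Plane Two).
sample = "\U00020000"
utf8_bytes = sample.encode("utf-8")   # four bytes: F0 A0 80 80
ncr_form = to_ncr(sample)             # "&#x20000;"
round_trip = from_ncr(ncr_form)       # back to the original character
```

The NCR form is pure ASCII, so it survives in a page regardless of how
the browser treats four-byte UTF-8 sequences.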

The Windows registry settings allow a default font to be specified for
any plane. I have one font set for Plane One and a different font
set for Plane Two in my registry, and Windows seems to handle this well.
(Except for the UTF-8 bug in Internet Explorer.)
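
To make "plane" concrete: the plane number is simply the code point
shifted right by 16 bits, and in UTF-16 every non-BMP character is
carried as a surrogate pair. The sketch below shows both calculations.
The function names are illustrative only, and no actual registry key
names are shown or assumed here.

```python
def plane_of(code_point):
    """Return the Unicode plane (0 = BMP, 1 = Plane One, 2 = Plane Two...)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    return code_point >> 16

def surrogate_pair(code_point):
    """Return the (high, low) UTF-16 surrogates for a non-BMP code point."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only supplementary code points have surrogate pairs")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)     # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits
    return high, low

# U+10400 (Deseret) is in Plane One; U+20000 (CJK Extension B) is in Plane Two.
for cp in (0x4E00, 0x10400, 0x20000):
    print("U+{:04X} is in plane {}".format(cp, plane_of(cp)))
```

A per-plane default font amounts to a lookup keyed on that plane number.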

Note also that it is possible to set a font other than the default font
for displaying non-BMP text, just as it's possible to change the font
in an HTML file, either with CSS or with font-face/family tags. The
registry settings should only supply the default; in other words, they
apply only if the application or mark-up has not specified another font.

I *think* that Windows 2000 always uses Unicode internally and uses an
internal conversion table when material is in a non-Unicode encoding like
GB-18030. As far as I know, this means that GB-18030 support on Win2000
would be limited to Unicode's BMP unless the special registry settings
were made. But I could be wrong on this. Since GB-18030 is important to
many, it's very possible that Microsoft already made allowances for this.
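
As an illustration of what converting GB-18030 to Unicode involves for
non-BMP material, here is a small sketch using Python's built-in gb18030
codec. It says nothing about how Windows 2000 performs the conversion
internally; it only shows that a supplementary character maps to a
four-byte GB-18030 sequence and round-trips cleanly.

```python
# U+20000, the first CJK Extension B character (Plane Two).
sample = "\U00020000"

# Encode to GB-18030; supplementary characters use the four-byte form.
gb_bytes = sample.encode("gb18030")

# Decode back to Unicode and confirm that nothing was lost.
decoded = gb_bytes.decode("gb18030")

print("GB-18030 bytes:", gb_bytes.hex())
print("round trip ok:", decoded == sample)
```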

Best regards,

James Kass


