Re: Unicode on a non-Unicode web page

From: John Cowan (jcowan@reutershealth.com)
Date: Thu Sep 07 2000 - 12:47:23 EDT


"Gary P. Grosso" wrote:

> Netscape Communicator 4.6 doesn't.

Versions of Netscape before 4.7 had this bug: character references greater
than &#255; only worked if the transmission character set was UTF-8.

> One way to look at this is: how do I use unicode as an
> "escape" to include some isolated content on a web page
> of arbitrary encoding?

You really can't. Put the whole page into UTF-8 (using either actual
UTF-8 byte sequences or character references), or stick to a single
charset. Or, of course, upgrade your clients to NN 4.7.

> If I change the meta tag to:
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
> then Netscape does slightly better (still stumbles over &#x-anything

Hexadecimal character references did not exist in SGML until 1998,
so it's not surprising if HTML systems don't handle them yet.
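
If the page source already contains hexadecimal references, they can be
rewritten as decimal ones before serving. A minimal sketch, assuming Python
is available on the authoring side; the function name hex_refs_to_decimal
is made up for illustration:

```python
import re

# Matches hexadecimal character references such as &#x3042; or &#X393;
HEX_REF = re.compile(r"&#[xX]([0-9A-Fa-f]+);")

def hex_refs_to_decimal(html: str) -> str:
    """Rewrite every hexadecimal character reference as a decimal one.

    Hypothetical helper: decimal references and all other text pass
    through unchanged.
    """
    return HEX_REF.sub(lambda m: "&#%d;" % int(m.group(1), 16), html)

print(hex_refs_to_decimal("&#x3042; &#x452; &#x393;"))
# prints: &#12354; &#1106; &#915;
```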

> and doesn't display the hiragana, but does display the DJE and GAMMA
> if I use decimal values) but of course now the Czech words are not
> displayed properly.

You can switch to decimal Unicode character references (in the range
&#256; to &#511;) for the Latin-2-specific characters.
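
One way to generate such references, sketched in Python under the assumption
that the text is available as a Unicode string: the standard
"xmlcharrefreplace" error handler emits a decimal reference for every
character the target encoding cannot represent. Encoding to ASCII is a
deliberate simplification here. It also turns Latin-1 characters such as
"á" into references, which is harmless, and the result is plain ASCII, so it
is valid whichever ASCII-compatible charset the page declares, UTF-8
included. The function name to_ascii_with_refs is made up for illustration.

```python
def to_ascii_with_refs(text: str) -> bytes:
    """Encode text as ASCII, replacing every non-ASCII character
    with a decimal character reference (hypothetical helper)."""
    return text.encode("ascii", errors="xmlcharrefreplace")

print(to_ascii_with_refs("Dvořák"))
# prints: b'Dvo&#345;&#225;k'
```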

-- 
There is / one art                   || John Cowan <jcowan@reutershealth.com>
no more / no less                    || http://www.reutershealth.com
to do / all things                   || http://www.ccil.org/~cowan
with art- / lessness                 \\ -- Piet Hein


