Numerical character references in HTML (was: Multilingual Documents)

From: Otto Stolz
Date: Mon Dec 06 1999 - 05:37:00 EST

On 1999-12-03 at 10:11, Erik van der Poel wrote:
> Even though [Netscape 4.X] doesn't support UTF-8 properly.

On the contrary, I am under the impression that Netscape 4.x does a
remarkably good job of supporting UTF-8 -- disregarding RTL languages
and bidi support, though.

However, there is an annoying deficiency in support of non-UTF encodings.

As said before in this thread, the document character set of any HTML 4
file is UCS, cf. <>. Hence,
any UCS character may be specified by a numerical character reference
(NCR), regardless of the content transfer encoding, cf.
<>. This would come in
handy for including an occasional extra character, e.g. from the
General Punctuation block, in an otherwise 8-bit coded document.
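
To illustrate the idea, here is a minimal sketch in Python (the language,
the helper name to_8bit_html and the sample string are my assumptions, not
part of the original discussion). It encodes text in an 8-bit charset and
writes every character that charset lacks as a decimal NCR:

```python
def to_8bit_html(text, encoding="iso-8859-1"):
    """Encode text in an 8-bit charset; characters the charset
    cannot represent become decimal NCRs such as &#8211;."""
    # "xmlcharrefreplace" is a standard Python codec error handler.
    return text.encode(encoding, errors="xmlcharrefreplace")


# Hypothetical sample: one Latin-1 letter plus General Punctuation
# characters and the euro sign, none of which exist in ISO 8859-1.
sample = "caf\u00e9 \u2013 \u201cquoted\u201d \u20ac"
encoded = to_8bit_html(sample)
print(encoded)  # b'caf\xe9 &#8211; &#8220;quoted&#8221; &#8364;'
```

The e-acute stays a single byte, while the dash, the quotation marks and
the euro sign travel as NCRs that a conforming HTML 4 user agent should
resolve against UCS, whatever the declared charset.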

However, this does not work with Netscape 4.x: though these browsers know
how to find glyphs for arbitrary UCS characters (if locally available at
all), they apply this knowledge only to UTF-8 encoded files, cf. my
examples <> vs.

Also, Netscape 4.x browsers do not recognize hexadecimal NCRs, cf.
my examples
<>, and again
<> and
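
For reference, the two NCR notations can be sketched as follows (Python
again; ncr_forms is a hypothetical helper name of mine). Both forms name
the same UCS code point, so a browser that handles only the decimal form
will misrender the hexadecimal one:

```python
import html


def ncr_forms(ch):
    """Return the (decimal, hexadecimal) NCR for a single character."""
    cp = ord(ch)
    return "&#%d;" % cp, "&#x%X;" % cp


dec, hexa = ncr_forms("\u20ac")  # EURO SIGN, U+20AC
print(dec, hexa)  # &#8364; &#x20AC;

# html.unescape (Python standard library) resolves both notations
# to the same character.
same = html.unescape(dec) == html.unescape(hexa) == "\u20ac"
print(same)  # True
```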

(Btw., the Euro-Latin-9.htm file is encoded in ISO 8859-15, which only
the most recent Netscape versions (4.7, I think) process properly,
whilst IE 5.0 still does not recognize ISO 8859-15.)
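
Why the charset label matters here can be shown in a few lines of Python
(a sketch under my own assumptions; it only illustrates the code-page
difference, not what any particular browser actually does). ISO 8859-15
places the euro sign at byte 0xA4, where ISO 8859-1 has the generic
currency sign:

```python
euro = "\u20ac"                           # EURO SIGN
as_latin9 = euro.encode("iso8859-15")     # single byte in ISO 8859-15
misread = as_latin9.decode("iso8859-1")   # same byte read as ISO 8859-1

print(as_latin9)      # b'\xa4'
print(repr(misread))  # '\xa4', i.e. U+00A4 CURRENCY SIGN

# ISO 8859-1 has no euro sign at all, so encoding it directly fails.
try:
    euro.encode("iso8859-1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False
print(latin1_has_euro)  # False
```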

Erik: Which version of Netscape will mend those errors?
All: How do other browsers perform, in that respect?

Best wishes,
   Otto Stolz
