Numerical character references in HTML (was: Multilingual Documents)

From: Otto Stolz (Otto.Stolz@uni-konstanz.de)
Date: Mon Dec 06 1999 - 05:37:00 EST

Next message: Jonathan Rosenne: "Re: Support for Multilingual Documents"
Previous message: Edward Cherlin: "Support for Multilingual Documents"
Next in thread: Erik van der Poel: "Re: Numerical character references in HTML (was: Multilingual Documents)"
Maybe reply: Erik van der Poel: "Re: Numerical character references in HTML (was: Multilingual Documents)"

On 1999-12-03 at 10:11, Erik van der Poel wrote:
> Even though [Netscape 4.X] doesn't support UTF-8 properly.

On the contrary, I am under the impression that Netscape 4.x does a
remarkably good job of supporting UTF-8 -- disregarding RTL languages
and bidi support, though.

However, there is an annoying deficiency in support of non-UTF encodings.

As said before in this thread, the document character set of any HTML 4
file is UCS, cf. <http://www.w3.org/TR/REC-html40/charset.html>. Hence,
any UCS character may be specified by a numerical character reference
(NCR), regardless of the character encoding of the file, cf.
<http://www.w3.org/TR/REC-html40/charset.html#h-5.3.1>. This would come
in handy for including an occasional extra character, e.g. from the
General Punctuation block, in an otherwise 8-bit coded document.
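
To illustrate the idea, here is a minimal sketch in Python (not part of
my test pages; the function name is made up for this example). It assumes
the document is to be stored in ISO 8859-1 and lets the codec's standard
"xmlcharrefreplace" error handler write every character outside that
repertoire as a decimal NCR:

```python
def to_latin1_with_ncrs(text):
    """Encode text as ISO 8859-1; characters that do not fit become decimal NCRs."""
    # "xmlcharrefreplace" is a standard codec error handler that emits &#NNNN;
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


# e-acute and no-break space exist in ISO 8859-1; the en dash (General
# Punctuation block) and the euro sign do not, so they are written as NCRs.
sample = "caf\u00e9 \u2013 10\u00a0\u20ac"
encoded = to_latin1_with_ncrs(sample)
print(encoded)  # b'caf\xe9 &#8211; 10\xa0&#8364;'
```

A browser that honours the document character set should render the two
NCRs exactly like the directly encoded characters.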

However, this does not work with Netscape 4.x: though these browsers know
how to find glyphs for arbitrary UCS characters (if locally available at
all), they apply this knowledge only to UTF-8 encoded files, cf. my
examples <http://www.rz.uni-konstanz.de/y2k/test/Go-Latin.htm> vs.
<http://www.rz.uni-konstanz.de/y2k/test/Go-UTF.htm>.

Also, Netscape 4.x browsers do not recognize hexadecimal NCRs, cf.
my examples
<http://www.rz.uni-konstanz.de/y2k/test/Euro-UTF.htm>,
<http://www.rz.uni-konstanz.de/y2k/test/Euro-Latin-9.htm>, and again
<http://www.rz.uni-konstanz.de/y2k/test/Go-Latin.htm> and
<http://www.rz.uni-konstanz.de/y2k/test/Go-UTF.htm>.
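
For reference, the decimal and hexadecimal forms denote the same
character, so a conforming parser must treat them alike. A small sketch in
Python (the helper name is hypothetical, and the standard html module
merely stands in for a conforming parser here):

```python
import html


def ncr_forms(ch):
    """Return the decimal and the hexadecimal NCR for a single character."""
    code_point = ord(ch)
    return f"&#{code_point};", f"&#x{code_point:X};"


decimal_ref, hex_ref = ncr_forms("\u20ac")  # EURO SIGN, U+20AC
print(decimal_ref, hex_ref)  # &#8364; &#x20AC;

# Both spellings decode to the very same character.
same = html.unescape(decimal_ref) == html.unescape(hex_ref) == "\u20ac"
print(same)  # True
```

Until browsers handle both forms, authors might prefer the decimal form,
which is the one Netscape 4.x does recognize in UTF-8 files.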

(Btw., the Euro-Latin-9.htm file is encoded in ISO 8859-15, which only
the most recent Netscape versions (4.7, I think) process properly,
whilst IE 5.0 still does not recognize ISO 8859-15.)

Erik: Which version of Netscape will mend those errors?
All: How do other browsers perform, in that respect?

Best wishes,
Otto Stolz

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:56 EDT