RE: XML, HTML and unicode

From: Mike Brown
Date: Fri Jul 07 2000 - 20:06:18 EDT

Ranganathan wrote:

> I wrote a small XML. I am reading the XML from a html
> file and populating the "span id"
> [...]
> The XML file contains characters for all languages [...]
> I saved the xml file in unicode format. I have included
> charset as utf-8 in the html file. The html file is able
> to read the xml file and populate properly but
> only junk appears. I tried using GB2312 as charset also.

I assume you mean that when your browser renders the dynamic HTML, it shows
characters that you don't expect. It sounds like your XML file and HTML file
may actually have different encodings, and your browser and script are
pulling byte sequences from the XML file and putting them into the HTML
without first converting them to the charset of the HTML file. How are you
getting things out of the XML from within your HTML? Are you using an actual
XML parser?
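
To illustrate the kind of mismatch I mean, here is a minimal Python sketch.
It assumes that "unicode format" means UTF-16 (which is what some editors
call it) and that the HTML page is UTF-8; the sample text and the
transcode() helper are made up for the example and are not taken from your
files.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes using their actual encoding, then re-encode for the page."""
    return data.decode(source).encode(target)


# Hypothetical sample text; any non-ASCII string shows the same effect.
text = "中文 déjà vu"

# Assumption: the XML file was saved as UTF-16.
xml_bytes = text.encode("utf-16")

# Wrong: treat the raw bytes as if they were already in the page's charset.
junk = xml_bytes.decode("utf-8", errors="replace")

# Right: convert from the file's real encoding to the page's charset.
html_bytes = transcode(xml_bytes, "utf-16", "utf-8")

print(repr(junk))
print(html_bytes.decode("utf-8"))
```

The first print shows junk, because UTF-16 bytes are being read as UTF-8.
The second shows the original text, because the bytes were converted before
being placed in the UTF-8 page.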

Also, you can't make up charsets and hope they'll work; the charset you put
in the XML document's XML declaration needs to be the actual encoding used
in that file, just as the charset in the HTML document's Content-Type needs
to be the actual encoding used for the HTML document.
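
As a rough illustration of why the declaration has to match the bytes, the
following Python sketch parses the same tiny document twice with the
standard xml.etree.ElementTree parser: once labeled honestly, and once
declared as ISO-8859-1 while actually stored as UTF-8. The <span> element,
the sample text, and the parse_text() helper are hypothetical, chosen only
for the example.

```python
import xml.etree.ElementTree as ET


def parse_text(declared: str, actual: str, text: str) -> str:
    """Build a small XML document that declares one encoding but is stored
    in another, parse it, and return the text the parser hands back."""
    doc = f'<?xml version="1.0" encoding="{declared}"?><span>{text}</span>'
    return ET.fromstring(doc.encode(actual)).text


sample = "déjà vu"

# Declaration matches the bytes: the parser returns the original text.
honest = parse_text("utf-8", "utf-8", sample)

# Declaration does not match the bytes: the parser trusts the label
# and decodes the UTF-8 bytes as ISO-8859-1, producing junk.
mislabeled = parse_text("iso-8859-1", "utf-8", sample)

print(repr(honest))
print(repr(mislabeled))
```

The parser raises no error in the mislabeled case; it simply believes the
declaration, so the wrong characters come out silently.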

   - Mike
Mike J. Brown, software engineer in Denver, Colorado, USA
