From: Mark Davis
Date: Thu May 10 2001 - 21:51:13 EDT

I had assumed that the parser would be rev'ed when I got IE 5.5 with the
latest patches. Is this always going to be an add-on, or will it be folded
in at some point?

Mark Davis, IBM GCoC, Cupertino
(408) 777-5850 [fax: 5892]

"Michel Suignard" on 05-10-2001 18:41:25

To: Mark Davis/Cupertino/IBM@IBMUS
cc: <>, <>
Subject: RE: UCD in XML

Mark, I already answered that question for you a while ago. Our current
XML parser (msxml3.dll) parses Unicode 3.1 correctly. In fact I edited
your file to verify this. And indeed, on a system like mine with a
surrogate font installed, it will even display the surrogate characters
(part of that font) correctly.
It is only the previous XML parser (shipped originally with the OS) that
has the problem.
Please go to the Microsoft web site to get
the version 3 (which is conformant per your definition) or even if you
are brave there is a preview version (version 4) which will read XML
MSXML 3.0 has been available now for over a year and can be installed in
a way to be either used separately or through IE (read the info at the
web site).

So no need to comment anything out (except of course the #FFFF you left
in) to be readable by our current XML parser. And the perf is quite
acceptable given the size of the file.
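
For reference, the remark about #FFFF follows from the Char production in
XML 1.0, which admits U+FFF9..U+FFFD and the supplementary planes but
excludes U+FFFE and U+FFFF. A minimal sketch of that check follows; the
function name is_xml_char is made up for illustration and is not part of
any parser discussed in this thread.

```python
def is_xml_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production.

    Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
           | [#x10000-#x10FFFF]
    """
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# Code points mentioned in this thread, plus the first supplementary one.
for cp in (0xFFF9, 0xFFFD, 0xFFFE, 0xFFFF, 0x10000):
    status = "allowed" if is_xml_char(cp) else "not allowed"
    print(f"U+{cp:04X}: {status}")
```

Under this rule only U+FFFE and U+FFFF from the list above would need to
be left out of an XML 1.0 file.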

PS please forward to Unicode as I will probably be blocked.

-----Original Message-----
From: Mark Davis
Sent: Thu, May 10, 2001 5:58 PM
Subject: UCD in XML

Several people asked me over the last month about the XML version of the
Unicode character database that I presented at last November's UTC
I posted it at, containing two



1. I regenerated the data with Unicode 3.1 data. However, (a) I haven't
done more than spot-check the results, and (b) the format differs
from what is documented in the notes.

2. I still have to comment out characters FFF9..FFFD, and all
so that people can read the file with Internet Explorer (I do wish they
would use a conformant XML parser). Also, note that IE takes quite a
while to load the file.
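
One way to see what a conformant parser accepts is to feed it character
references for the code points in question. The sketch below uses Python's
standard xml.etree.ElementTree (backed by Expat) purely as a stand-in; it
is not the parser used by IE or MSXML, and the element name "char" is
invented for the test and is not taken from the actual UCD file format.

```python
import xml.etree.ElementTree as ET


def parses(xml_text: str) -> bool:
    """Return True if xml_text is accepted by the parser as well-formed."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Character references for code points discussed in this thread.
samples = {
    "U+FFF9": "<char>&#xFFF9;</char>",    # interlinear annotation anchor
    "U+FFFD": "<char>&#xFFFD;</char>",    # replacement character
    "U+10400": "<char>&#x10400;</char>",  # supplementary (surrogate-pair) character
    "U+FFFF": "<char>&#xFFFF;</char>",    # noncharacter, outside XML 1.0 Char
}

for name, doc in samples.items():
    result = "accepted" if parses(doc) else "rejected"
    print(f"{name}: {result}")
```

A parser that follows XML 1.0 should accept the first three samples and
reject only the U+FFFF reference.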

Mark Davis, IBM GCoC, Cupertino
(408) 777-5850 [fax: 5892]
