Re: Unicode Technical Report #22

From: Mark Davis (mark.davis@us.ibm.com)
Date: Thu Mar 20 2003 - 14:05:51 EST


The only problem would arise if you were trying to read a CharML file
that *itself* was encoded using a character set that your XML parser
didn't know. That's one reason for always encoding the CharML files
themselves in UTF-8 or ASCII. I'll post this to a broader mailing list,
since some others may have similar concerns.
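
As a rough illustration of that rule, a loader could inspect the XML declaration before handing a mapping file to the parser and refuse anything not stored as UTF-8 or ASCII. This is only a sketch under stated assumptions: the names `declared_encoding` and `is_bootstrap_safe` are hypothetical and not part of ICU or UTR #22, the `<mapping/>` element in the usage note is a placeholder, and the check only sniffs ASCII-compatible input rather than implementing full XML encoding detection.

```python
import re

# Encodings treated as safe for mapping files, following the advice above.
# UTF-16 is deliberately left out of this set (an assumption of the sketch).
SAFE_ENCODINGS = {"utf-8", "us-ascii", "ascii"}

# Matches an encoding pseudo-attribute in a leading XML or text declaration.
_DECL = re.compile(
    rb"""^<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def declared_encoding(data: bytes) -> str:
    """Return the lowercase encoding a file claims, defaulting to 'utf-8'."""
    if data.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"  # a UTF-16 byte order mark
    if data.startswith(b"\xef\xbb\xbf"):
        data = data[3:]  # skip a UTF-8 byte order mark
    match = _DECL.match(data)
    return match.group(1).decode("ascii").lower() if match else "utf-8"


def is_bootstrap_safe(data: bytes) -> bool:
    """True if the file can be parsed without any external mapping table."""
    return declared_encoding(data) in SAFE_ENCODINGS
```

For example, `is_bootstrap_safe(b"<?xml version='1.0' encoding='ISO-8859-1'?><mapping/>")` is false, so a loader following this sketch would reject that file instead of asking the conversion library for a table it may not have yet.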

Mark
___
mark.davis@us.ibm.com
IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193
(408) 256-3148
fax: (408) 256-0799

"Claude Tardif" <intmktg@cam.org>
To: Mark Davis/Cupertino/IBM@IBMUS
cc: <marc@sitepak.com>
Date: 2003.03.19 21:44
Subject: Unicode Technical Report #22

Your document referenced in the title of this message specifies an XML
format for the interchange of mapping data for character encodings.
Conversely, section 4.3.3 of the Extensible Markup Language (XML) 1.0
(Second Edition) specifies an encoding declaration that states the
character encoding of an XML entity. If character encoding data is
stored in XML and XML relies on character encoding, there is necessarily
an interdependency loop. For example, what if a conversion library such
as ICU parsed character encoding files using an XML parser that itself
used ICU to convert the character encoding of entities? Then, if the XML
file defining the charset encoding for ISO-8859-1 began with the
declaration <?xml encoding='ISO-8859-1'?>, this would cause a loop,
because the converter would need the ISO-8859-1 mapping in order to
parse the file that defines it.
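
The loop described above can be made concrete with a small simulation. This is a hedged sketch, not how ICU or any real XML parser is structured: `load_table`, `BootstrapLoop`, `BUILTIN`, and the `file_encoding_of` dictionary are all hypothetical names introduced for illustration. The sketch assumes the parser can decode only UTF-8 and ASCII on its own and must request a converter for anything else.

```python
# Encodings the hypothetical parser can decode without any mapping table.
BUILTIN = {"utf-8", "ascii"}


class BootstrapLoop(Exception):
    """Raised when loading a table would require that same table."""


def load_table(name, file_encoding_of, in_progress=frozenset()):
    """Simulate loading the mapping table for the encoding `name`.

    `file_encoding_of` maps an encoding name to the encoding in which its
    mapping file is stored (illustrative data, not a real ICU structure).
    """
    if name in in_progress:
        raise BootstrapLoop(f"{name} is needed to parse its own mapping file")
    stored_as = file_encoding_of[name]
    if stored_as not in BUILTIN:
        # The parser first needs a converter for the file's own encoding.
        load_table(stored_as, file_encoding_of, in_progress | {name})
    return f"table for {name} (file stored as {stored_as})"
```

Calling `load_table("iso-8859-1", {"iso-8859-1": "iso-8859-1"})` raises `BootstrapLoop`, while `load_table("iso-8859-1", {"iso-8859-1": "utf-8"})` succeeds, since the file can be read with a built-in decoder.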

My question is: Is there a way for a conversion library and an XML
parser to use each other's services without causing such an
interdependency loop and, preferably, without imposing requirements such
as forbidding encoding declarations in character encoding files?

Marc Tardif


This archive was generated by hypermail 2.1.5 : Thu Mar 20 2003 - 15:00:08 EST