XML parsers check for a byte order mark (BOM) at the beginning of the document.
If an XML document starts with the bytes
FE FF, it is encoded in UTF-16 (or UCS-2), big-endian;
FF FE, UTF-16 (or UCS-2), little-endian (byte-swapped);
00 00 FE FF, UCS-4, big-endian; and
FF FE 00 00, UCS-4, little-endian (byte-swapped).
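
As a rough illustration of that check, here is a minimal Python sketch, not what any particular parser actually does. The function name sniff_bom and the returned labels are made up for this example; the byte sequences are the ones given in the encoding autodetection appendix of the XML 1.0 recommendation. It only looks for a BOM and does not handle documents that have none.

```python
from typing import Optional


def sniff_bom(data: bytes) -> Optional[str]:
    """Guess the encoding family from a leading BOM; None if no BOM is found.

    Hypothetical helper for illustration only.
    """
    # The four-byte signatures must be tested first, because FF FE 00 00
    # (UCS-4 little-endian) begins with the UTF-16 little-endian BOM FF FE.
    signatures = [
        (b"\x00\x00\xfe\xff", "UCS-4 big-endian"),
        (b"\xff\xfe\x00\x00", "UCS-4 little-endian"),
        (b"\xfe\xff", "UTF-16 big-endian"),
        (b"\xff\xfe", "UTF-16 little-endian"),
    ]
    for bom, name in signatures:
        if data.startswith(bom):
            return name
    return None


if __name__ == "__main__":
    # A UTF-16 big-endian document: BOM followed by the encoded declaration.
    doc = b"\xfe\xff" + '<?xml version="1.0"?>'.encode("utf-16-be")
    print(sniff_bom(doc))
```

Once the BOM has told the parser the code unit width and byte order, it can read the declaration itself and pick up the exact encoding name from there.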
> -----Original Message-----
> From: Paul Deuter [mailto:Paul.Deuter@plumtree.com]
> Sent: Thursday, July 13, 2000 5:47 PM
> To: Unicode List
> Subject: Using Unicode in XML
> I know that XML can contain Unicode by using the declaration
> <?xml version="1.0" encoding="ISO-10646-UCS-2"?>
> But there seems to be a chicken and egg dilemma here. If
> I encode my whole XML stream as Unicode, then the parser
> will need to know that the stream is Unicode in order to be able
> to parse the declaration which tells it that it is Unicode.
> If the parser cannot figure out that the stream is Unicode, then
> it won't be able to read the declaration. But if it can recognize
> the Unicode, then the declaration would seem to be superfluous.
> How do systems handle this?