From: John Cowan (jcowan@reutershealth.com)
Date: Fri Dec 10 2004 - 19:26:22 CST
Marcin 'Qrczak' Kowalczyk scripsit:
> http://www.w3.org/TR/2000/REC-xml-20001006#charsets
> implies that the appropriate level for parsing XML is code points.
You are reading the XML Recommendation incorrectly. It is not defined
in terms of code points (8-bit, 16-bit, or 32-bit) but in terms of
characters. XML processors are required to process UTF-8 and UTF-16,
and may process other character encodings or not. But the internal
model is that of characters. Thus surrogate code points are not
allowed.
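
To make the point concrete, here is a minimal sketch (not taken from any
particular XML processor) of the Char production in section 2.2 of the
Recommendation, written as a Python predicate. The function names
is_xml_char and first_illegal_char are hypothetical, chosen only for
illustration. Note that the surrogate range D800-DFFF falls in the gap
between the second and third ranges, so it is excluded.

```python
def is_xml_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production.

    Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
           | [#x10000-#x10FFFF]
    """
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF       # stops just before the surrogates
        or 0xE000 <= cp <= 0xFFFD     # resumes after them; omits FFFE, FFFF
        or 0x10000 <= cp <= 0x10FFFF  # supplementary planes
    )


def first_illegal_char(text: str):
    """Return the index of the first non-Char in text, or None if all are legal.

    Hypothetical helper: it assumes text is already a decoded Python str,
    so the check runs on characters, not on UTF-8 or UTF-16 code units.
    """
    for index, ch in enumerate(text):
        if not is_xml_char(ord(ch)):
            return index
    return None
```

Because the check runs on decoded characters, a supplementary character
that UTF-16 encodes as a surrogate pair passes as a single character,
while a lone surrogate is rejected.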
--
John Cowan  www.reutershealth.com  www.ccil.org/~cowan  jcowan@reutershealth.com
Arise, you prisoners of Windows / Arise, you slaves of Redmond, Wash,
The day and hour soon are coming / When all the IT folks say "Gosh!"
It isn't from a clever lawsuit / That Windowsland will finally fall,
But thousands writing open source code / Like mice who nibble through a wall.
        --The Linux-nationale by Greg Baker
This archive was generated by hypermail 2.1.5 : Fri Dec 10 2004 - 19:27:54 CST