Re: [REPOST, LONG] XML and tags (LONG) - SCSU for XML

From: John Cowan
Date: Fri Feb 21 2003 - 21:59:54 EST

    Markus Scherer scripsit:

    > Yes. Any reasonable SCSU encoder will stay in the ASCII-compatible
    > single-byte mode until it sees a character from beyond Latin-1. Thus
    > the encoding declaration will be ASCII-readable.

    Indeed, there is no requirement that the declaration be ASCII-readable. A
    parser can perfectly well handle EBCDIC or other non-ASCII-compatible
    encodings provided a proper declaration expressed in that encoding is
    present.
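
    As a rough illustration of the point about EBCDIC, here is a minimal
    Python sketch of how a parser might recognize a declaration that is not
    ASCII-readable. It assumes Python's cp037 codec as the EBCDIC flavor and
    uses the label "ebcdic-cp-us" purely for illustration; the helper names
    guess_family and read_declaration are hypothetical, not part of any real
    parser. The byte signature 4C 6F A7 94 is "<?xm" in EBCDIC.

```python
# A declaration as it might appear at the start of an EBCDIC document.
# The encoding label is an illustrative assumption.
XML_DECL = '<?xml version="1.0" encoding="ebcdic-cp-us"?>'

# "<?xm" as bytes in EBCDIC and in ASCII-compatible encodings.
EBCDIC_SIGNATURE = bytes([0x4C, 0x6F, 0xA7, 0x94])
ASCII_SIGNATURE = b"<?xm"

# Bootstrap codec used only to read the declaration itself (assumed mapping).
BOOTSTRAP_CODEC = {"ebcdic": "cp037", "ascii-compatible": "ascii"}


def guess_family(data: bytes) -> str:
    """Guess the encoding family from the first four bytes (hypothetical)."""
    if data.startswith(EBCDIC_SIGNATURE):
        return "ebcdic"
    if data.startswith(ASCII_SIGNATURE):
        return "ascii-compatible"
    return "unknown"


def read_declaration(data: bytes) -> str:
    """Decode just the XML declaration using the guessed bootstrap codec."""
    family = guess_family(data)
    if family not in BOOTSTRAP_CODEC:
        raise ValueError("no recognizable XML declaration")
    codec = BOOTSTRAP_CODEC[family]
    # Find the closing "?>" as it is spelled in the bootstrap codec.
    end = data.index("?>".encode(codec)) + 2
    return data[:end].decode(codec)


if __name__ == "__main__":
    document = (XML_DECL + "<doc/>").encode("cp037")
    print(guess_family(document))
    print(read_declaration(document))
```

    The declaration is unreadable as ASCII, yet the parser recovers it
    because the first few bytes already narrow down the encoding family.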

    To be sure, some encodings, like US-BSCII, are problematic. US-BSCII is
    the same as US-ASCII except that 0x41 is B and 0x42 is A; the trouble
    being of course that the string "US-ASCII" encoded in US-ASCII uses the
    same bytes as the string "US-BSCII" encoded in US-BSCII. But such a thing
    is not likely to happen except through perversity such as this.
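
    The ambiguity can be checked mechanically. The following is a small
    Python sketch of the made-up US-BSCII encoding as described above
    (US-ASCII with the bytes for A and B exchanged); the function names
    bscii_encode and bscii_decode are hypothetical and exist only for this
    illustration.

```python
# US-BSCII as described: identical to US-ASCII except 0x41 is B and 0x42 is A.
_SWAP_AB = str.maketrans("AB", "BA")


def bscii_encode(text: str) -> bytes:
    """Encode text in the hypothetical US-BSCII encoding."""
    # Swapping A and B first, then encoding as ASCII, puts B at 0x41.
    return text.translate(_SWAP_AB).encode("ascii")


def bscii_decode(data: bytes) -> str:
    """Decode bytes from the hypothetical US-BSCII encoding."""
    return data.decode("ascii").translate(_SWAP_AB)


if __name__ == "__main__":
    as_ascii = "US-ASCII".encode("ascii")
    as_bscii = bscii_encode("US-BSCII")
    # The two encoding names are byte-for-byte identical on the wire.
    print(as_ascii.hex(" "))
    print(as_bscii.hex(" "))
    print(as_ascii == as_bscii)
```

    Because each name, written in its own encoding, yields the same bytes,
    a declaration naming either one cannot tell a parser which of the two
    is actually in use.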

    John Cowan     
    To say that Bilbo's breath was taken away is no description at all.  There
    are no words left to express his staggerment, since Men changed the language
    that they learned of elves in the days when all the world was wonderful.
            --_The Hobbit_

