Re: Nicest UTF

From: John Cowan (jcowan@reutershealth.com)
Date: Fri Dec 10 2004 - 19:38:59 CST

Next message: Asmus Freytag: "Re: US-ASCII (was: Re: Invalid UTF-8 sequences)"

Previous message: John Cowan: "Re: Nicest UTF"
In reply to: Philippe Verdy: "Re: Nicest UTF"
Next in thread: Marcin 'Qrczak' Kowalczyk: "Re: Nicest UTF"

Philippe Verdy scripsit:

> If you look at the XML 1.0 Second Edition

The Second Edition has been superseded by the Third.

That is normative.

> But the comment following it specifies:

That comment is not normative and not meant to be precise.

> the restrictive
> definition of "Char" above also includes the whole range of C1 controls

By oversight.

> (#x80..#x9F), so I can't understand why the Char definition is so
> restrictive on controls; in addition the definition of Char also
> *includes* many non-characters (it only excludes surrogates, and U+FFFE
> and U+FFFF, but forgets to exclude U+1FFFE and U+1FFFF, U+2FFFE and
> U+2FFFF, ..., U+10FFFE and U+10FFFF).

By oversight again.
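
For illustration, here is a minimal sketch of the XML 1.0 Char production as a Python predicate. The name `is_xml10_char` is invented for this example; the ranges are those of the production itself (#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]). It shows that the C1 controls and the supplementary-plane noncharacters are admitted, while the surrogates, U+FFFE and U+FFFF are not.

```python
def is_xml10_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )

# The C1 controls U+0080..U+009F fall inside [#x20-#xD7FF], so all are allowed.
c1_allowed = all(is_xml10_char(cp) for cp in range(0x80, 0xA0))

# Noncharacters U+nFFFE / U+nFFFF in planes 1..16 fall inside [#x10000-#x10FFFF].
plane_nonchars = [(plane << 16) | low
                  for plane in range(1, 17)
                  for low in (0xFFFE, 0xFFFF)]
nonchars_allowed = all(is_xml10_char(cp) for cp in plane_nonchars)

# Only the BMP pair U+FFFE / U+FFFF and the surrogates are excluded.
bmp_excluded = not any(is_xml10_char(cp) for cp in (0xFFFE, 0xFFFF, 0xD800, 0xDFFF))
```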

> Note however that nearly all XML parsers don't seem to honor this
> constraint (like SGML parsers...)!

Please specify the parsers that do and don't honor this. Any which
don't honor it are buggy, and any documents which exploit those bugs
are not XML.

> What is even worse is that XML 1.1 now reallows NUL for system
> identifiers and URIs, through escaping mechanisms.

Not true. U+0000 is absolutely excluded in both XML 1.0 and XML 1.1.
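
As a rough check, the Char productions of both versions can be written as range tables and tested for U+0000. The names `XML10_CHAR`, `XML11_CHAR`, `in_ranges` and `parser_rejects` are made up for this sketch. The last line additionally assumes Python's bundled expat parser, which implements XML 1.0 only, so it says nothing about any particular XML 1.1 parser.

```python
import xml.etree.ElementTree as ET

# Char production, XML 1.0: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
XML10_CHAR = [(0x9, 0xA), (0xD, 0xD), (0x20, 0xD7FF),
              (0xE000, 0xFFFD), (0x10000, 0x10FFFF)]
# Char production, XML 1.1: [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
XML11_CHAR = [(0x1, 0xD7FF), (0xE000, 0xFFFD), (0x10000, 0x10FFFF)]

def in_ranges(cp, ranges):
    """Return True if code point cp lies in any inclusive (low, high) range."""
    return any(low <= cp <= high for low, high in ranges)

# XML 1.1 widens Char down to U+0001, but U+0000 is in neither table.
nul_in_xml10 = in_ranges(0x0, XML10_CHAR)
nul_in_xml11 = in_ranges(0x0, XML11_CHAR)

def parser_rejects(doc):
    """Return True if ElementTree (expat) refuses to parse doc."""
    try:
        ET.fromstring(doc)
    except ET.ParseError:
        return True
    return False

# A character reference must also match Char, so escaping NUL does not help.
nul_ref_rejected = parser_rejects("<a>&#0;</a>")
```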

-- 
"I could dance with you till the cows           John Cowan
come home.  On second thought, I'd              http://www.ccil.org/~cowan
rather dance with the cows when you             http://www.reutershealth.com
came home."  --Rufus T. Firefly                 jcowan@reutershealth.com

