From: Peter Kirk (peter.r.kirk@ntlworld.com)
Date: Wed Aug 13 2003 - 09:05:26 EDT
On 13/08/2003 04:44, Jon Hanna wrote:
>No, the safe thing to do (and the thing that is done) is to treat the space
>as a space, ignoring the fact that the NMTOKEN contains a combining
>character. This is even safer than your suggestion, since it can't
>misidentify the combining properties of a character.
>
>
OK, it's safe, but it is a misuse of Unicode. Since a space followed by a
combining character is a unit in Unicode, higher-level protocols should
treat it as a unit. If higher-level protocols are allowed to do arbitrary
things inside Unicode units, there is no end to the possible confusion.
See for example, from Unicode 4.0 chapter 3:
C7 A process shall interpret a coded character representation according
to the character semantics established by this standard, if that process
does interpret that coded character representation.
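
To make the disagreement concrete, here is a rough Python sketch of the two
behaviours under discussion. It is only an illustration, not what any XML
parser actually does: the function names are made up for this example, only
U+0020 is treated as a separator (XML also allows tab, CR and LF), and "is a
combining character" is approximated as "has a general category starting
with M".

```python
import unicodedata


def split_naive(value: str) -> list[str]:
    """Treat every U+0020 as a separator, whatever follows it."""
    return [tok for tok in value.split(" ") if tok]


def split_respecting_units(value: str) -> list[str]:
    """Treat U+0020 as a separator only when it is not the base of a
    combining sequence (assumption: next character has category M*)."""
    tokens: list[str] = []
    current: list[str] = []
    for i, ch in enumerate(value):
        nxt = value[i + 1] if i + 1 < len(value) else ""
        followed_by_mark = bool(nxt) and unicodedata.category(nxt).startswith("M")
        if ch == " " and not followed_by_mark:
            # A plain space: close the current token, if any.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            # Ordinary character, or a space that carries a combining mark.
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


# "abc", then SPACE + COMBINING ACUTE ACCENT, then "def", a plain space, "ghi"
sample = "abc \u0301def ghi"
naive_tokens = split_naive(sample)
unit_tokens = split_respecting_units(sample)
```

With this input the naive split produces a token that starts with a bare
combining accent, while the unit-respecting split keeps the space and its
accent together inside one token.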
--
Peter Kirk
peter@qaya.org (personal)
peterkirk@qaya.org (work)
http://www.qaya.org/
This archive was generated by hypermail 2.1.5 : Wed Aug 13 2003 - 09:40:52 EDT