Re: Non-ascii string processing?

From: jon@spin.ie
Date: Wed Oct 08 2003 - 12:30:57 CST


> Of course it would have been possible to handle the "Astral Planes"
> uniformly by making every character in them a legal Char, but not a
> valid name character or name start character. This would have avoided
> silliness like elements named after the musical symbol for a six
> string fretboard or the damage of using undefined characters in XML
> documents. It also would have been much more compatible with existing
> parsers and tools. :-(

This would have created the opposite silliness: perfectly sensible name start characters being arbitrarily disallowed. I'd like to see the musical symbol for a six-string fretboard disallowed because we *know* what it is, and we *know* it is in category So (Symbol, other) and hence not an appropriate character for such use.
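
As a rough illustration of "we know its category", here is a minimal Python sketch that asks the Unicode Character Database for a character's general category. The helper name is_symbol is my own invention, and treating every S* category as unsuitable for names is an assumption made for the example, not a rule from any XML specification. The answer also depends on the Unicode version bundled with the Python in use.

```python
import unicodedata

def is_symbol(ch: str) -> bool:
    """Hypothetical helper: True if ch is in a symbol category (Sm, Sc, Sk, So)."""
    return unicodedata.category(ch).startswith("S")

# Look the character up by its formal name rather than hard-coding a code point.
fretboard = unicodedata.lookup("MUSICAL SYMBOL SIX-STRING FRETBOARD")

print(f"U+{ord(fretboard):04X}", unicodedata.category(fretboard), is_symbol(fretboard))
print("U+0041", unicodedata.category("A"), is_symbol("A"))
```

A name-character check built this way rejects a character for what it is known to be, rather than for which plane it happens to live in.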

Similarly, we know U+4FFFE will never be assigned, since it is a non-character, so I'd prefer to see it disallowed as such, from Char and from all other productions.

With unassigned characters (but not non-characters), doing anything other than allowing them would cause forwards-compatibility issues, since a later version of Unicode may assign them. However, they (along with the private-use characters) are probably characters that should not be used for interchange.
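
The distinctions drawn above can be sketched in Python. This is only a sketch: the function names and the returned labels are my own, and the "unassigned" answer reflects whichever Unicode version ships with the Python in use, which is exactly the forwards-compatibility point. The non-character test, by contrast, is stable: U+FDD0..U+FDEF plus the last two code points of every plane.

```python
import unicodedata

def is_noncharacter(cp: int) -> bool:
    """True for U+FDD0..U+FDEF and for U+nFFFE / U+nFFFF in every plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE

def classify(cp: int) -> str:
    """Hypothetical classifier; the labels are illustrative, not standard terms."""
    if is_noncharacter(cp):
        return "noncharacter"      # permanently reserved, never to be assigned
    cat = unicodedata.category(chr(cp))
    if cat == "Co":
        return "private-use"       # meaning is by private agreement only
    if cat == "Cs":
        return "surrogate"
    if cat == "Cn":
        return "unassigned"        # may be assigned by a later Unicode version
    return "assigned"

for cp in (0x41, 0xE000, 0x40000, 0x4FFFE):
    print(f"U+{cp:04X}: {classify(cp)}")
```

Only the "unassigned" answer can change over time, which is why rejecting those code points outright would break documents written against a later version of Unicode.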


