Re: XML identifiers
An interesting document I've seen recently proposes an encoding scheme to
represent Unicode characters that are otherwise not allowed in XML 1.0
identifiers. I can't share the specific proposal, but I expect the principle
should be well understood by the members of this list. It uses '_' as an
escape character to signal a notation for an arbitrary Unicode character
(a few simple additional rules decide whether a given '_' starts an encoded
non-XML 1.0 character). It is similar to other schemes that extend the range
of names a restricted syntax can carry (such as DNS names).
Using such a scheme allows you to present names to the user using the full
range of Unicode (any version) and keep complete compatibility with XML 1.0.
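
As a rough sketch of the general idea (this is not the proposal mentioned
above, which I can't share), here is one way an '_'-escape could work. The
`_xHHHH_` notation, the function names, and the ASCII-only set of "allowed"
characters are all assumptions made for illustration; the real XML 1.0 Name
production admits many more characters than this.

```python
import re
import string

# Assumption: a deliberately narrow, ASCII-only notion of what a name may contain.
NAME_START = set(string.ascii_letters + "_")
NAME_CHAR = NAME_START | set(string.digits + "-.")

# Hypothetical escape notation: "_x" + 4 to 6 hex digits + "_".
ESCAPE = re.compile(r"_x([0-9A-Fa-f]{4,6})_")


def encode_name(name: str) -> str:
    """Escape every character the restricted syntax would reject."""
    out = []
    for i, ch in enumerate(name):
        allowed = NAME_START if i == 0 else NAME_CHAR
        # A literal "_" directly before "x" could be misread as the start
        # of an escape, so it is escaped as well to keep decoding unambiguous.
        ambiguous = ch == "_" and name[i + 1 : i + 2] == "x"
        if ch in allowed and not ambiguous:
            out.append(ch)
        else:
            out.append(f"_x{ord(ch):04X}_")
    return "".join(out)


def decode_name(encoded: str) -> str:
    """Turn each escape back into the character it stands for."""
    return ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), encoded)


print(encode_name("first name"))  # first_x0020_name
print(decode_name("first_x0020_name"))  # first name
```

The extra rule for a literal "_" followed by "x" is the kind of "simple
additional rule" needed so that a name which already looks like an escape
still survives a round trip.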
These simple escape/encoding schemes seem to be a common technique for
extending standards/syntaxes. I applied this myself in a tool for encoding
WinHelp topic identifiers.
Perhaps authors of new standards can anticipate this need and define the
extension mechanism as part of the initial standard, instead of having the
--- Paul Chase Dempsey
Microsoft Visual Studio Text Editor Development
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:59 EDT