Re: Non-ascii string processing?

From: Doug Ewell
Date: Wed Oct 08 2003 - 09:03:16 CST

Elliotte Rusty Harold <elharo at metalab dot unc dot edu> wrote:

> Of course it would have been possible to handle the "Astral Planes"
> uniformly by making every character in them a legal Char, but not a
> valid name character or name start character. This would have avoided
> silliness like elements named after the musical symbol for a six
> string fretboard or the damage of using undefined characters in XML
> documents. It also would have been much more compatible with existing
> parsers and tools. :-(

You can never completely avoid silliness -- just look at yesterday's

But the "undefined characters" issue is a greater problem. Limiting the
pool of valid name characters to those already assigned in Unicode X.X
would mean either:

(a) the XML spec would have to be updated promptly, 1 to 2 times per
year, to keep up with each new minor release of Unicode, or

(b) any characters added to Unicode after version X.X would be excluded,
creating one of those "digital divide" issues when someone wants to
create a Buginese or Tai Lue identifier and can't.
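
To make option (b) concrete, here is a rough Python sketch of what
freezing the character pool at one Unicode version looks like. It is
only an illustration under some loud assumptions: the helper
is_assigned_letter is hypothetical, "assigned letter" is a crude
stand-in for "valid name character" and is not the XML Name production,
and the frozen Unicode 3.2.0 database that Python's unicodedata module
ships plays the role of "Unicode X.X".

```python
import unicodedata

# Hypothetical helper, for illustration only. "Assigned letter" is a crude
# stand-in for "valid name character"; it is NOT the XML Name production.
def is_assigned_letter(ch, ucd=unicodedata):
    # Unassigned code points report general category "Cn", so they fail here.
    return ucd.category(ch).startswith("L")

# Python ships a frozen copy of the Unicode 3.2.0 database; it plays the
# role of "Unicode X.X" in this sketch.
frozen = unicodedata.ucd_3_2_0

buginese_ka = "\u1a00"  # BUGINESE LETTER KA
tai_lue_qa = "\u1980"   # NEW TAI LUE LETTER HIGH QA

for ch in (buginese_ka, tai_lue_qa, "A"):
    print(
        f"U+{ord(ch):04X}",
        "frozen:", is_assigned_letter(ch, frozen),
        "current:", is_assigned_letter(ch),
    )
```

Under the frozen database the Buginese and Tai Lue letters come back as
unassigned and would be rejected, while the runtime's current database
accepts them. A spec pinned to one version has exactly that gap until it
is revised.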

-Doug Ewell
 Fullerton, California
