Re: Rationale wanted for Unicode identifier rules

From: Dan Oscarsson (
Date: Thu Mar 02 2000 - 02:30:54 EST

>In other words, programming languages have historically tended to allow
>anything in an identifier that wasn't used for some syntactic purpose;
>leading digits were forbidden to make lexers simpler. What specific
>reason is there not to treat all hitherto-unknown Unicode characters
>as legitimate in identifiers, in the manner of the Plan9 C compiler
>(which extends C to treat everything from U+00A0 on up as valid)?

One important thing to remember is that there are several types
of identifiers: for example, those used for variables and those used
for operators.
For variables it might be a good idea to restrict the allowed characters
to "word-like" ones, while an operator could use nearly any type of character.
When I define a new comparison operator, I do not want to call it
"equals"; I want to call it "=" (or "==" if you are a C programmer).

When you leave the ASCII range there are many more good non-letters that
can be used (for example, I use the not sign "¬" instead of "!" in some
interfaces). So you have to allow many of the non-letters in some types
of identifiers.
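
As a rough sketch of that split (not taken from any particular language),
the two identifier types could be told apart by Unicode general category.
The category sets and the function names is_variable_name and
is_operator_name are assumptions made up for illustration; a real language
would also have to exclude whatever punctuation its own syntax reserves.

```python
import unicodedata

# Hypothetical category sets, chosen only for illustration.
# Letters (and letter-like numbers) may start a variable name.
WORD_START = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
# Later characters may also be combining marks, digits, or connectors like "_".
WORD_CONTINUE = WORD_START | {"Mn", "Mc", "Nd", "Pc"}
# Operators draw on symbols and punctuation instead of letters.
# Assumption: "Po" is included so that "!" works, although it also covers
# characters such as "," and ";" that a real grammar would reserve.
OPERATOR = {"Sm", "So", "Pd", "Po"}


def is_variable_name(s):
    """True if s is "word-like": a letter or "_" first, then word characters."""
    if not s:
        return False
    if s[0] != "_" and unicodedata.category(s[0]) not in WORD_START:
        return False
    return all(unicodedata.category(c) in WORD_CONTINUE for c in s[1:])


def is_operator_name(s):
    """True if s consists only of symbol or punctuation characters."""
    return bool(s) and all(unicodedata.category(c) in OPERATOR for c in s)
```

Under these assumptions "equals" is accepted only as a variable name, while
"=", "==" and the not sign are accepted only as operator names.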

