RE: Apostrophes, quotation marks, keyboards and typography

From: Jonathan Coxhead (jonathan@doves.demon.co.uk)
Date: Mon Jul 19 1999 - 23:13:24 EDT


 | > > ... ASCII for
 | > > identifiers is just fine, and if it forces software engineers to stay
 | > > with English identifiers, then trust me, this is a feature, not a bug.
 | >
 | An example where nonASCII identifiers is really useful is in coding
 | up mathematical formulae that contain Greek letters. For example, a
 | program is much more readable if you use U+3B1 for alpha rather than
 | spelling out the name alpha. Similarly U+3C0 for pi. Hopefully C++
 | will follow Java's excellent example and allow Unicode alphabetics in
 | variable names.

   C++ already allows what it calls "universal characters" (i.e.,
Unicode) in identifiers, and C9x, which may or may not yet live up to
its name by being published before Jan 1st next year, has followed suit
and specified the idea more precisely.
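
   To make the Greek-letter case quoted above concrete, here is a
minimal sketch. It assumes a compiler that accepts universal character
names in identifiers; the names circle_area and check_circle_area are
made up for illustration and come from no standard or library.

```c
#include <assert.h>

/* The identifier below is U+03C0 GREEK SMALL LETTER PI, written as a
   universal character name so the source file itself stays ASCII. */
static const double \u03C0 = 3.14159265358979323846;

/* circle_area is a hypothetical name used only for this illustration. */
double circle_area(double radius)
{
    return \u03C0 * radius * radius;
}

/* Small self-check: a circle of radius 2 has area 4*pi, about 12.566. */
void check_circle_area(void)
{
    double area = circle_area(2.0);
    assert(area > 12.566 && area < 12.567);
}
```

   An editor that knows about these names could show the constant as
the Greek letter itself, while the file on disk stays plain ASCII.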

   In a C9x program, you can use an identifier such as 'âge', and you
can spell it '\u00E2ge' (or even '\U000000E2ge') if you prefer. The
three forms are equivalent, but the first depends on the existence of
an a-circumflex in the source character set. I imagine we will see
development environments that use the others (called "universal
character names", or UCNs) behind the scenes, but display and accept
the first form. Some of the Unicode character property database is
included in the draft standard (explicitly, not by reference), so there
is likely to be no end of problems in the future should the two
diverge. And in this world, they *always* diverge :-)
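
   The equivalence of the two escaped spellings can be sketched as
follows, again assuming a compiler that accepts universal character
names in identifiers. The function names are hypothetical, and the
literal 'âge' spelling is left out so that the example does not depend
on the source character set.

```c
#include <assert.h>

/* Declared with the short escaped form (four hex digits) of U+00E2,
   LATIN SMALL LETTER A WITH CIRCUMFLEX, followed by "ge". */
static int \u00E2ge = 0;

/* Hypothetical helper: writes through the short spelling and reads
   back through the long spelling (eight hex digits). Both spellings
   denote one and the same identifier. */
int set_then_read(int years)
{
    \u00E2ge = years;
    return \U000000E2ge;
}

/* Small self-check: the value written is the value read back. */
void check_same_identifier(void)
{
    assert(set_then_read(42) == 42);
}
```

   If the two spellings named different variables, the second would be
undeclared and the file would not compile at all.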

   They stopped short, however, of requiring that 'a\u0302' (LATIN
SMALL LETTER A + COMBINING CIRCUMFLEX ACCENT) be treated as equivalent
to 'â' (LATIN SMALL LETTER A WITH CIRCUMFLEX) and '\u00E2' (the UCN for
the latter).
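
   To see why the two spellings are not automatically the same, here
is a minimal sketch that compares them code point by code point, with
no Unicode normalization applied. The helper same_spelling is
hypothetical, and whether any real compiler compares identifiers this
way is an assumption on my part, not something the draft says.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the two code point sequences are
   identical, 0 otherwise. No normalization is performed. */
int same_spelling(const unsigned long *a, size_t na,
                  const unsigned long *b, size_t nb)
{
    size_t i;

    if (na != nb)
        return 0;
    for (i = 0; i < na; i++) {
        if (a[i] != b[i])
            return 0;
    }
    return 1;
}

/* U+00E2: the precomposed a-circumflex. */
static const unsigned long precomposed[] = { 0x00E2 };
/* U+0061 followed by U+0302: plain 'a' plus the combining accent. */
static const unsigned long decomposed[] = { 0x0061, 0x0302 };

/* Small self-check: the sequences differ, though they look alike. */
void check_not_equivalent(void)
{
    assert(!same_spelling(precomposed, 1, decomposed, 2));
    assert(same_spelling(precomposed, 1, precomposed, 1));
}
```

   Treating the two as one identifier would take a normalization step
before the comparison, and that is the requirement they stopped short
of.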

 | One particularly intriguing possibility is the use of Chinese
 | characters. Imagine the expressive power of relatively short program
 | statements if you could use such succinct representations of ideas.
 | Of course, you need to read Chinese...!

   This would be easy, just needing the right editor. I'm also amused
by the idea of defining 'operator \u2295', and writing expressions with
CIRCLED PLUS, and the other Unicode operator symbols, though that's not
on the horizon at the moment.

        /|
 o o o (_|/
        /|
       (_/


