Re: Unicode in source code. WHY?

From: Yung-Fong Tang (ftang@netscape.com)
Date: Tue Jul 20 1999 - 12:22:29 EDT


Torsten Mohrin wrote:

> Can someone give me at least one really good reason, why I should use
> Unicode in identifiers in programming languages? What's wrong with
> English and ASCII (and I mean "ASCII") and [A-Za-z_] ?

1. For part of the same reason that we use high-level programming
languages such as C/C++, COBOL, Pascal, and Java today instead of coding
in assembly or even machine code. Of course, there are other reasons why
we use these high-level languages. However, part of the reason we use
them is readability. Readability also depends on who will read the
program over its whole lifetime, not just on who reads the code while the
program is being developed.
2. If you can tell me a good reason why people in China, Korea, Russia,
and Japan need to learn English first before they learn C++, then
probably I can tell you why.

>
>
> UCNs are a good idea, e.g. in string literals, regular expression,
> resource files, config files and so on. But, IMHO, it's a very stupid
> idea to use Unicode characters in identifiers. I will never use them
> and I will forbid the programmers in my company to use them
> (fortunately I can do that). We use only English based identifiers.

That is because all your colleagues use English. But that is not true for
a software project that does business ONLY in one region and won't be
scaled to other languages. If no one who doesn't know the language will
ever need to read the code, there is no reason they should not use
identifiers in their own language.
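
To make that concrete, here is a minimal sketch, assuming a language that
accepts Unicode letters in identifiers (Python 3 is used here only for
illustration). The function and variable names are hypothetical, made up
for this example; the Chinese version is what a team that reads and writes
only Chinese might prefer, and the English version is shown for comparison.

```python
# Hypothetical example: the same small function written twice.
# All identifiers below are invented for illustration.

# Version a Chinese-only team might write:
# 計算總價 = "compute total price", 單價 = "unit price", 數量 = "quantity"
def 計算總價(單價, 數量):
    return 單價 * 數量

# Version using English-based ASCII identifiers.
def compute_total(unit_price, quantity):
    return unit_price * quantity

# Both versions behave the same; only the target reader differs.
總價 = 計算總價(25, 4)
total = compute_total(25, 4)
```

Which version is more readable depends entirely on who will read the code.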

I think whether the whole thing makes sense depends on:
1. Who will read the code over the lifetime of this program?
2. Can this software scale up to apply to other languages?
3. Interoperability with existing software facilities, such as operating
systems, libraries, and imports from foreign countries.

Ask whether there is any chance that
1. A non-native speaker of the language may read the code during the
lifetime of the software.
2. The project needs to use any software facility, such as an operating
system, library, or utilities, created by people who do not read or write
that language.

If either of the above is true, then using only a "common language" in
identifiers is good practice. Notice I say "common language", not English.
Today, English is the "common language" of the world, so for now it is
definitely English. However, that was not true in the 10th century, and it
is not necessarily true in the 21st or 22nd century. Remember, English was
not even the "common language" in California during the 16th century, nor
in the south of the USA during the 18th century. Who knows what language
will be the "common language" 100 years from now? Maybe Klingon, right?

> What's the real advantage of using all possible letters in function or
> variable names? I think, the way Java and C9x go is wrong!

So we can write programs in Klingon (or Chinese, or Zulu) in the 21st
century, when Klingon is the common language around the world. (See
http://www.sil.org/ethnologue/top100.html for details.)

:)

>
>
> --
> Torsten Mohrin
> Sharmahd Computing GmbH, Hannover, Germany
> Phone: +49-511-13780, Fax: +49-511-13450
> http://www.sharmahd.com, mohrin@sharmahd.com




