Re: Unicode in source code. WHY?

From: Markus Kuhn (Markus.Kuhn@cl.cam.ac.uk)
Date: Tue Jul 20 1999 - 19:33:12 EDT


"Patrick Andries" wrote on 1999-07-20 18:41 UTC:
> Torsten Mohrin <mohrin@sharmahd.com> wrote:
> >To use only one language (and character set) in source code is a
> >matter of source code maintenance. My English is not the best, but the
> >quality of source code improved a lot after switching from German to
> >English.
>
> This is very interesting: could you explain why switching from German to
> English [in the use of identifiers] improved the quality of your source code
> ? This sounds counter-intuitive since one would expect that native German
> speakers immediately recognize German words, remember them easily, type them
> with more ease than foreign words, finally these words would also describe
> their function in a more comprehensible and accurate fashion (the developer
> having a wider vocabulary in his mother tongue).

Your intuition is wrong here. As someone who grew up in the same Central
European tribe as Torsten, I can assure you that technical conversations
between German information technology experts are much clearer when we
use English vocabulary than they would be if we tried to introduce
localized equivalents - as the French or some elderly German computer
science professors try to do. For technical terms like "port", "interface",
"pixel", "socket", "byte", "header file", "thread", "backtracking",
"routing", "back propagation", "pruning", "hashing", "little-endian",
etc., it is *completely* irrelevant to us what the original
pre-computing-era English meaning of the word might be or what the
most suitable German translation would be. These are precisely defined
technical concepts in this context that have very little to do with the
original native meaning of these English words. In fact, I think it is
conceptually even useful to introduce a new foreign word, because this
signals that you are dealing with an abstract concept that you have to
understand first, no matter whether the term comes from a language that
you know or not. Someone who does not know what a "TCP socket" is will
have zero advantage from being a native English speaker when it comes to
guessing what a TCP socket might be. Using terms of your native language
just gives you the illusion that you know what you are talking about,
and that can frequently be a problem. Isn't that also why doctors use
Greek and Latin, to make sure that their precise terminology is not
polluted by spurious other meanings that native words might have?

The only problem that I see with introducing English vocabulary into
German is that the spelling/pronunciation relationship, which usually
follows rather simple and predictable principles, is severely degraded.
The same happened with English centuries ago, which is why English has
such a completely bizarre spelling today (-ough-, etc.). I guess German
spelling is heading the same way, and I fear that we will not be able to
keep up with spelling reforms every 100 years or so to keep German a
language with a nice, rather phonetic writing system.

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>


