Re: Character name translations

From: Jukka K. Korpela
Date: Fri, 21 Dec 2012 09:17:59 +0200

2012-12-21 2:45, Asmus Freytag wrote:

> But when real people, not biologists, want to look up information they
> have precisely two choices: they can look at a visual index (for species
> that can be arranged visually) or they can look up the scientific name
> for the species based on the only thing they know: the local popular name.

No, they have many other choices as well. They can use tools that search
by various criteria, see for example.
Similarly, there are tools that search for Unicode characters by
different properties and other criteria, even by shape drawn:

In such searches, common names can be useful indeed, but then it’s a
matter of any common name for a character, not “the name”. Some purposes
I mentioned earlier, like discussing a character descriptively or
normatively, benefit from “the name”, i.e. an official name.

This means that it’s a matter of compiling information about names
actually used for characters, even if this means listing a dozen or more
names for “@”. So it’s about collecting data, not about setting
standards. Language authorities may wish to set standards on names of
some characters, especially those that are used in the orthography rules
of a language. But that’s just a small part of the issue.
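
The difference between "the name" and a collection of common names can
be sketched in a few lines of Python. This is only an illustration under
stated assumptions: the formal name comes from the standard unicodedata
module, while the COMMON_NAMES table and the names_for helper are
hypothetical, and the handful of entries are samples, not a complete or
authoritative list.

```python
import unicodedata

# Hypothetical, hand-collected sample data: common names keyed by
# character and then by language tag. Real data would be much larger.
COMMON_NAMES = {
    "@": {
        "en": ["at sign", "commercial at"],
        "fi": ["ät-merkki", "miukumauku", "kissanhäntä"],
        "de": ["At-Zeichen", "Klammeraffe"],
        "nl": ["apenstaartje"],
    },
}

def names_for(char, lang):
    """Return (formal_name, common_names) for a character in a language.

    The formal name is the single official identifier-like name; the
    common names are whatever has been recorded for that language,
    possibly none at all.
    """
    formal = unicodedata.name(char)
    common = COMMON_NAMES.get(char, {}).get(lang, [])
    return formal, common

formal, common = names_for("@", "fi")
print(formal)   # the one official name
print(common)   # any number of recorded common names
```

The point of the design is that the number of common names is open-ended
per language, whereas the formal name is always exactly one string that
can be looked up unambiguously, e.g. with unicodedata.lookup.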

>>>> So Unicode names should not be translated at all, any more than you
>>>> translate General Category values for example.
>>> Why wouldn't you?
>> Because those values are identifiers.
> No, names have multiple uses; especially if you take the formal name as
> one in a series of "aliases" for each character - that's why it's often
> more useful to think of translations of the full code charts and
> character index, instead of "just" the formal names. (The latter, by
> themselves are not so useful).

I think this reflects the idea of recording actual use of names, but in
an unnecessarily formal way. It’s not about translating anything,
really. The common English names mentioned for many characters in the
annotations of the standard are just examples of common names in one
language. They may give ideas of the kinds of names other languages
might have, but that’s it. If a character has four common names
mentioned there, this does not mean I would need to find four
corresponding common names when considering the names of the character
in another language. Another language might have only one common name
for it, or it might have ten.

> The linguistic content of the short labels is indeed limited, however, I
> can see good reasons to provide alternate abbreviations for characters,
> e.g. for ZWSP or WJ, because these terms are used in places where they
> do not act as identifiers.

Abbreviations are yet another thing, along with names and identifiers,
and indeed very useful, even indispensable in some contexts (like
tables). It is possible to construct different language-specific
abbreviations, and some might be in actual use, but in almost all cases,
it seems best to stick to abbreviations like ZWSP or WJ, independently
of language. Perhaps the most commonly needed abbreviations might be
localized. But seriously, if I need to mention NBSP in Finnish and need
to use an abbreviation, I will surely use NBSP, expecting it to be
familiar to some of my readers, whereas for any abbreviation I make up,
everyone but me has to look for an explanation in the text. And if a
committee decided on an official abbreviation, the odds of making it
widely known (and widely accepted and used) would be very small.
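
As a rough sketch of treating abbreviations as language-independent
labels, the following Python maps the abbreviations mentioned above to
their characters and expands them using the standard unicodedata module.
The ABBREVIATIONS table and the expand helper are hypothetical names
introduced only for this illustration.

```python
import unicodedata

# Hypothetical table: one shared abbreviation per character, used as is
# regardless of the language of the surrounding text.
ABBREVIATIONS = {
    "NBSP": "\u00A0",
    "ZWSP": "\u200B",
    "WJ": "\u2060",
}

def expand(abbrev):
    """Expand an abbreviation to its code point and formal name."""
    char = ABBREVIATIONS[abbrev]
    return f"U+{ord(char):04X} {unicodedata.name(char)}"

for abbrev in ABBREVIATIONS:
    print(abbrev, "=", expand(abbrev))
```

A reader who does not recognize an abbreviation then needs only one
lookup, and the same table serves texts written in any language.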

Received on Fri Dec 21 2012 - 01:19:20 CST
