RE: Character name translations

From: Whistler, Ken <ken.whistler_at_sap.com>
Date: Thu, 20 Dec 2012 19:44:45 +0000

Jukka Korpela noted:

> The standard ISO 10646, which is equivalent to Unicode as regards to
> character names, is published in French, too

Actually, ISO/IEC 10646 is *not* published in French.

But a related standard, the international string ordering standard, ISO/IEC 14651 (the one whose main weight table is maintained in synchronization with the Unicode Collation Algorithm), *is* published in both English and French. This is the result of heroic effort by the editor of 14651, Alain LaBonté. What is pertinent about that for this particular topic is that Alain maintains a translation database which he uses to update all the *names* of the Unicode characters printed in the weight table for ISO/IEC 14651 to French names.

So if you go to the ISO catalogue store for ISO/IEC 14651, you can find both English and French versions, and the latest version contains the weight table with all the latest French name translation information that Alain maintains for the characters in the table. (All of Unicode, except the predictable CJK unified ideographs and Hangul syllables, which don't really need one-by-one translations, anyway.) Unfortunately, the table format is aimed at collation, rather than easy extraction of translations for character names. And the standard costs CHF 146,00 -- it isn't one of the freely available downloads.

Don't forget that there is also a (somewhat tongue-in-cheek) *American English* translation of the Unicode 4.1 names list available in UTN #24:

http://www.unicode.org/notes/tn24/

This listing does exemplify some of the issues related to turning some of the peculiar character name choices in the standard into labels more recognizable to end users (e.g., the issues of "FULL STOP" --> "PERIOD", "SOLIDUS" --> "SLASH", and so on), even in contexts where you are using the English character names.
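As a rough illustration of that relabeling issue, here is a minimal Python sketch that looks up the formal name with the standard library's unicodedata module and then applies a small override table. The table FRIENDLY_LABELS and the helper friendly_name are hypothetical names invented for this example; the table holds only the two pairs mentioned above and is not taken from UTN #24 or any maintained list.

```python
import unicodedata

# Hypothetical override table (an assumption for illustration only):
# formal Unicode character names mapped to labels that may be more
# recognizable to end users. Only the two pairs named in the text appear.
FRIENDLY_LABELS = {
    "FULL STOP": "PERIOD",
    "SOLIDUS": "SLASH",
}


def friendly_name(ch: str) -> str:
    """Return a user-facing label for a single character.

    Falls back to the formal Unicode name when no override exists, and
    to a U+XXXX code point label when the character has no formal name
    (for example, control characters).
    """
    formal = unicodedata.name(ch, "")
    if not formal:
        return f"U+{ord(ch):04X}"
    return FRIENDLY_LABELS.get(formal, formal)


if __name__ == "__main__":
    for ch in ".", "/", "A":
        print(repr(ch), "->", friendly_name(ch))
```

Which labels belong in such a table depends on the use scenario and the audience, so the table is better treated as application data than as a translation of the standard.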

Finally, I endorse the various caveats of Asmus and Mark in responding to this thread. One needs to keep in mind the various use scenarios in presenting what is ostensibly a "character name" to a user. It is far more useful to analyze those scenarios and figure out what would be most useful and meaningful to end users than it is to assume that "the problem" would somehow go away if there were maintained, standard lists of translations of just the formal Unicode character names into various languages.

--Ken
Received on Thu Dec 20 2012 - 13:48:00 CST
