Unicode Technical Note #24

Sample American English Translation of Unicode Names List

Version 1
Authors: Ken Whistler
Date: 2005-07-15
This Version: http://www.unicode.org/notes/tn24/tn24-1.html
Previous Version: n/a
Latest Version: http://www.unicode.org/notes/tn24/


This technical note provides an example of how the Unicode character names list [Names] for Unicode, Version 4.1.0, may be translated into other languages. This translation is an American English translation.


This document is a Unicode Technical Note. It is supplied purely for informational purposes and publication does not imply any endorsement by the Unicode Consortium. For general information on Unicode Technical Notes, see http://www.unicode.org/notes/.

1 Introduction

This technical note provides an example of how the Unicode character names list [Names] for Unicode, Version 4.1.0, may be translated into other languages. This translation is an American English translation.

The translated names list in the accompanying data files is provided only for informational purposes, and is not part of the Unicode Standard. The author has no intention of updating or maintaining the translation to match future versions of the Unicode Standard, so people who use the file do so at their own risk.

The idea is to demonstrate by example how a translation of the names list can provide an informative list of Unicode characters without having to match exactly the sometimes confusing normative Unicode character names in the standard. Such translations can be useful, for example, in discussions about characters or in a user interface, where it matters more that the person using the name is clear about the identity of the character as they know it than that the name exactly matches the normative character name in the standard.

The American English "translation" systematically converts Anglicisms such as FULL STOP and SOLIDUS to the more recognizable American English terms PERIOD and SLASH. It also replaces oddities of character encoding standards, such as CARON, with more recognizable terms such as HACEK. Various corrections for known misspellings or other errors in the normative names are also applied, in the interest of providing American English terms that make as much sense as possible. Of course, many Unicode character names are for highly technical symbols:


or for characters in scripts that English speakers are typically not familiar with, and use terms from other languages:


No attempt is made to provide explanatory rewordings of such names, or to translate script-specific terms in character names into analogous English phrases, as such rewordings or translations are unlikely to help in identifying the characters.

Instead, the translation simply culls away irrelevant distractions for American English speakers that result from Anglicisms, standardese, and miscellaneous naming mistakes.
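As an illustration of the kind of systematic term replacement described above, the following is a minimal Python sketch. It is not the procedure used to produce the data file, which this note does not describe. The table contains only the three term pairs named in this section, and the names `TERM_MAP` and `americanize` are hypothetical.

```python
import re

# Hypothetical substitution table holding only the three pairs named
# in the text; the real translation covers many more terms.
TERM_MAP = {
    "FULL STOP": "PERIOD",
    "SOLIDUS": "SLASH",
    "CARON": "HACEK",
}

# Try longer terms first so multi-word terms are matched as a unit.
# The \b anchors restrict matches to whole words.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(t) for t in sorted(TERM_MAP, key=len, reverse=True))
    + r")\b"
)


def americanize(name: str) -> str:
    """Replace each known term in a character name with its mapped term."""
    return _PATTERN.sub(lambda m: TERM_MAP[m.group(1)], name)


if __name__ == "__main__":
    for original in ("FULL STOP", "SOLIDUS", "LATIN SMALL LETTER S WITH CARON"):
        print(original, "->", americanize(original))
```

A whole-word table lookup like this only handles vocabulary swaps. Corrections for misspellings or other one-off naming errors would need per-character overrides, which this sketch leaves out.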


2 Data File

The accompanying text file contains the actual translated names list.

American English Translated Names List

The text file uses the same format and syntax conventions as [Names]. See [Format]. This means that, if desired, the translated names list can be manipulated with the same unibook utility program that can be used to view the untranslated names list.
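Because the file follows the names list format, it can also be read with a small script. The sketch below rests on a simplified reading of that format and should be checked against [Format]: it assumes that a character entry line starts with a hexadecimal code point, a tab, and the name, that lines starting with a tab annotate the preceding entry, and that lines starting with "@" are headers. The names `Entry` and `parse_names_list` are hypothetical.

```python
import re
from dataclasses import dataclass, field

# Assumption: an entry line is a 4-6 digit uppercase hex code point,
# a tab, and then the character name.
ENTRY_RE = re.compile(r"^([0-9A-F]{4,6})\t(.+)$")


@dataclass
class Entry:
    code_point: int
    name: str
    annotations: list = field(default_factory=list)


def parse_names_list(lines):
    """Collect character entries and their annotation lines."""
    entries = []
    current = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        match = ENTRY_RE.match(line)
        if match:
            current = Entry(int(match.group(1), 16), match.group(2))
            entries.append(current)
        elif line.startswith("\t") and current is not None:
            # Tab-indented lines (aliases, notes, cross references)
            # belong to the most recent character entry.
            current.annotations.append(line.strip())
        elif line.startswith("@"):
            # Header lines end the current entry.
            current = None
        # Any other line (comments, blank lines) is ignored.
    return entries
```

The function accepts any iterable of lines, so it can be given an open file object or a list of strings.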

Note that this is a plain text file but, for technical reasons, it is encoded in ISO/IEC 8859-1 (Latin-1) rather than in UTF-8.
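Tools that expect UTF-8 input may need the file converted first. A minimal Python sketch of that conversion follows; the function name `latin1_to_utf8` is hypothetical, and the caller is assumed to have already read the file's contents as bytes.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO/IEC 8859-1 bytes as UTF-8."""
    # ISO/IEC 8859-1 maps every byte value 0x00-0xFF to the code point
    # with the same number, so this decode step cannot fail.
    return data.decode("iso-8859-1").encode("utf-8")
```

When reading the file as text directly, passing `encoding="iso-8859-1"` to Python's built-in `open` has the same effect on decoding.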

3 References

[FAQ] Unicode Frequently Asked Questions
For answers to common questions on technical issues.
[Format] Unicode 4.1.0 Names List documentation
[Names] Unicode 4.1.0 Names List
The Unicode 4.1.0 Names List file from which this translation is derived.
[Versions] Versions of the Unicode Standard
For details on the precise contents of each version of the Unicode Standard, and how to cite them.


The following summarizes modifications from the previous version of this document.

Version 1: Initial version