Re: [Not OT] localized names of the Unicode Control characters

From: Philippe Verdy (
Date: Thu May 29 2003 - 20:00:50 EDT

    From: "Patrick Andries" <>
    > From: Philippe Verdy (
    > >Microsoft displays these French translations for character names. There are
    > >however some "strange" translations that lack a common formal format that
    > >allows easier searching for related characters.
    > I would be interested off-line (et en français) to learn about these «
    > strange » translations. I do not dispute this could be the case, some are
    > inherited and some (especially the block names) are Microsoft's own name and
    > differ from the ISO 10646 names, and of course some may be due to the French
    > translation team.
    > >I did not know that ISO10646 lists French versions of these names, and
    > >wonder if this is normative.
    > They are.
    > >If so, why aren't these French names listed by some derived Unicode file
    > >(which would combine the UCD >with the file ISO10646 French names) ?
    > A file similarly formatted to
    > exists here
    > .

    Thanks for this reference (and thanks also for pointing to this excellent French translation of the ISO10646/Unicode standard).

    This file seems to match the French translation used in Windows XP's charmap accessory (with a few composition problems: some names are followed by comments in lowercase between parentheses, which would have been better placed on a separate line marked with a "*").
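
    To illustrate the composition problem mentioned above, here is a minimal sketch in Python of how a trailing lowercase comment in parentheses could be moved to its own "*" line. The helper name and the sample entry are hypothetical, and I am assuming that a real name never contains parentheses itself, which I have not checked against the French list.

```python
import re
from typing import List

# Matches "NAME (comment)" where the parenthesised part ends the entry.
# Assumption: genuine character names contain no parentheses.
_TRAILING_COMMENT = re.compile(r"^(?P<name>[^()]+?)\s*\((?P<comment>[^()]+)\)\s*$")


def split_name_and_comment(entry: str) -> List[str]:
    """Split 'NAME (comment)' into the name and a separate '* comment' line.

    Entries without a trailing parenthesised part, or whose parenthesised
    part contains no lowercase letter, are returned unchanged.
    """
    match = _TRAILING_COMMENT.match(entry)
    if match and any(c.islower() for c in match.group("comment")):
        return [match.group("name"), "\t* " + match.group("comment")]
    return [entry]


if __name__ == "__main__":
    # Hypothetical entry, not taken from the actual file.
    for line in split_name_and_comment("NOM HYPOTHÉTIQUE (commentaire en minuscules)"):
        print(line)
```

    The check for a lowercase letter is only a heuristic to avoid touching a parenthesised part that might belong to the name itself.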

    So I think the names in both Windows and this Hapax page come from an ISO10646 normative reference file in French, and it contains the names for Unicode 3.2 characters (but not yet the characters added or modified in Unicode 4.0).

    It's just a shame that Windows XP does not let us see the normative English names of characters (I need to look them up by loading the large UCD file in a text editor).
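
    As an alternative to opening the UCD file by hand, the English names can also be queried programmatically. The following is only a sketch using Python's standard unicodedata module; the helper names are my own, and the Unicode version covered depends on the Python build (see unicodedata.unidata_version), so it will not necessarily match the version discussed here.

```python
import unicodedata
from typing import List, Tuple


def english_name(ch: str) -> str:
    """Return the English character name, or a placeholder if none is defined.

    Control characters have no formal name in the database, so the
    placeholder is returned for them.
    """
    return unicodedata.name(ch, "<no name>")


def describe(text: str) -> List[Tuple[str, str]]:
    """List (code point, English name) pairs for each character in text."""
    return [("U+%04X" % ord(ch), english_name(ch)) for ch in text]


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    for code, name in describe("\u00e9\u20ac\u0000"):
        print(code, name)
```

    Note that this only gives the English names; the French names would still have to come from a separate file.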

    The strange translation is however in the names of Unicode character blocks in the French version of charmap (which are simply wrong, because they were not translated from an ISO10646 reference or a Unicode reference).

    Some strange names belong to private use area characters that Microsoft allocated to support some of its codepages, but there are also cases where a strange name occurs both in the "ListeDesNoms.htm" file above and in Charmap, and cannot easily be searched for. Microsoft only implemented the canonical names (plus a few dictionary search orders for ideographic characters), but no aliases or usage comments in Charmap...

    Also, as this alternate translation helps in understanding the semantics of a character, it should be published by Unicode, without requiring us to look for and buy a copy of the ISO10646 standard. After all, the English names are normative in both Unicode and ISO10646, and synchronized. Why wouldn't Unicode also reference the ISO10646 French names?

    Maybe ISO10646 also has other normative translations (Chinese? Spanish?) that would help if they were available. However, the official ISO working languages are English and French, with a few other official translations in specific standards where another language is absolutely required, such as standards related to transliteration of foreign languages, notably Russian, Greek, and probably Chinese Pinyin.

