Re: String name and Character Name

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Mon Apr 18 2005 - 16:09:22 CST

    From: "Mark Davis" <mark.davis@jtcsv.com>
    > It is pointless to keep asking for the names to be changed, deprecated, or
    > replaced. Because of the Unicode stability policy, that simply will not
    > happen, as should be clear to you from the many responses on this topic.

    It would be a good idea to give a hint about why the standard names cannot
    be changed now: think about Unicode regular expressions that use the
    "\N{STANDARD UNICODE CHARACTER NAME}" character specifier. If such a name
    were changed, it could simply break the regular expressions used in
    applications and locale data that rely on it.

    (My opinion is that these regular expressions would have been better
    written using "\uNNNN" or "\U000NNNNN" with the hexadecimal code point...
    but people often cannot be bothered to look up the exact code point in a
    charmap...)
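
    To illustrate the point, here is a minimal sketch in Python (my choice of
    language for the example; the thread itself names none). It assumes the
    standard "unicodedata" module and a helper, resolve(), that is purely
    hypothetical. It shows that a name-based reference and a code point
    escape select the same character, and that a name the database does not
    know cannot be resolved at all:

```python
import re
import unicodedata

# The same character written by standard name and by hexadecimal code point.
by_name = "\N{LATIN SMALL LETTER E WITH ACUTE}"
by_code = "\u00e9"

# Two patterns that should behave identically as long as the name is stable.
pattern_by_name = re.compile("caf" + re.escape(by_name))
pattern_by_code = re.compile("caf\u00e9")


def resolve(name):
    """Hypothetical helper: map a standard name to its character.

    Returns None when the name is not in the character database, which is
    what would happen to stored patterns if a name were ever changed.
    """
    try:
        return unicodedata.lookup(name)
    except KeyError:
        return None
```

    A pattern stored with the code point keeps working whatever happens to
    the name list; a pattern stored with the name depends on that exact
    spelling staying valid.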

    So these standard names are made for TECHNICAL use such as regexps. They are
    not intended to be displayed to users.

    For GUI applications like charmaps or input method editors that allow
    searching for a character by name, the standard Unicode/ISO/IEC 10646 name
    is not the best fit, not least because it is not translated into the
    user's language... For such applications, where character names must be
    exposed to users, a separate, localizable name list is certainly better.
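
    As a rough sketch of what such an application could do, assuming a
    localized name table exists: look up the user-facing name first and fall
    back to the standard name only when no translation is available. The
    table LOCALIZED_NAMES, its single French entry, and the function
    display_name() are all hypothetical placeholders for illustration, not
    data from any published list.

```python
import unicodedata

# Hypothetical localized name table keyed by (language, code point).
# The one French entry is an illustrative placeholder only.
LOCALIZED_NAMES = {
    ("fr", 0x00E9): "lettre minuscule latine e accent aigu",
}


def display_name(char, language):
    """Return a user-facing name for char in the given language.

    Falls back to the standard (untranslated) name, and then to a
    "U+XXXX" label for code points that have no name at all.
    """
    localized = LOCALIZED_NAMES.get((language, ord(char)))
    if localized is not None:
        return localized
    # unicodedata.name returns the supplied default for unnamed code points.
    return unicodedata.name(char, "U+%04X" % ord(char))
```

    The standard name stays the stable technical identifier; only the
    display layer varies by language.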

    One question here: which names are displayed by the Windows "CharMap"
    accessory? I note that they are not necessarily the standard names, and
    Microsoft has tweaked this list in its French version. Has Microsoft created
    versions of "Charmap" containing character name lists in languages other
    than English and French?

    Where do the character names and classification found in the Chinese or
    Japanese or Korean IMEs come from? The UniHan database?

    I'm quite sure that such localized name lists already exist somewhere.
    Shouldn't they become part of a common standard, even if they are not in
    Unicode or ISO/IEC 10646? Why not share this information in a localization
    project like the CLDR?


