Re: String name and Character Name

From: Edward H. Trager (ehtrager@umich.edu)
Date: Sat Apr 16 2005 - 17:07:49 CST


    On Saturday 2005.04.16 15:15:12 +0300, Jukka K. Korpela wrote:

    > The idea of using CLDR for the purposes of localized names for characters
    > is probably crucial to addressing the problem of misleading official
    > names. It allows each language community to define, to the extent it finds
    > useful and possible, descriptive names that are widely understood within
    > the community. These names could then be used in utilities like Character
    > Map.
    >
    > But defining localized names is a huge task, especially if it needs to be
    > based on some kind of consensus. I would expect that for most language
    > forms, the localized data would consist of the names of characters
    > commonly used in the language itself (broadly speaking). Even this will
    > take quite some time and effort.

    I agree: properly localized names in the CLDR are the right answer.
    A wiki-style web site where the members of each language community
    could reach consensus on the names they consider correct may
    be the best way to cope with the sheer size of the task.

    > How about the following idea of overcoming the difficulty?
    > 1. Identify the characters with misleading official names.

    Are you volunteering to collect a short list as a starting point?

    > 2. Define better names for them in the "en" locale, and preferably
    > in the "fr" locale as well.

    I don't have time to do this myself, but if I did, I would consider setting up
    a wiki-style project that is not limited to en and fr: I would include at least
    all six official UN languages to start.

    > 3. Enhance CLDR with the feature of combining locales, in the sense
    > that a user's locale choice can consist of a sequence of locales
    > in order of preference. For example, a user's choice could mean
    > "use the 'de' locale for anything defined there but the 'en'
    > locale for things that aren't defined in the 'de' locale".

    To the best of my knowledge, Mac OS X already takes this approach to
    displaying localized messages. This is the right answer.

    >
    > That way, when accessing a character with a misleading official name,
    > the information shown to the user would consist of its localized name
    > in the "en" locale (or maybe "fr" locale), unless a name has been defined
    > for it in the user's preferred locale.
    >
    > --
    > Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
    >
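    The fallback described in point 3 can be sketched roughly as follows.
    This is only an illustration in Python, not how CLDR or Mac OS X
    implements it: the LOCALIZED_NAMES table, its entries, and the
    localized_name() helper are all hypothetical. The one real API used
    is unicodedata.name() from the Python standard library, which
    supplies the official name when no locale in the list defines one.

```python
import unicodedata
from typing import Optional, Sequence

# Hypothetical per-locale tables of localized character names, keyed by
# code point. The entries are illustrative only and are not CLDR data.
LOCALIZED_NAMES = {
    "en": {0x00E4: "a with diaeresis", 0x01A2: "letter gha"},
    "de": {0x00E4: "a-Umlaut"},
}


def localized_name(code_point: int,
                   preferred_locales: Sequence[str]) -> Optional[str]:
    """Return the first localized name found, trying locales in order.

    If no listed locale defines a name, fall back to the official
    Unicode character name (or None if the character has no name).
    """
    for locale in preferred_locales:
        name = LOCALIZED_NAMES.get(locale, {}).get(code_point)
        if name is not None:
            return name
    return unicodedata.name(chr(code_point), None)


if __name__ == "__main__":
    # A user who prefers 'de' but accepts 'en' for anything 'de' lacks.
    for cp in (0x00E4, 0x01A2, 0x0041):
        print(f"U+{cp:04X}: {localized_name(cp, ['de', 'en'])}")
```

    With the preference list ['de', 'en'], U+00E4 is named from the
    'de' table, while U+01A2 (officially LATIN CAPITAL LETTER OI, one of
    the names usually cited as misleading) is named from the 'en' table
    because 'de' has no entry for it.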

    - Ed Trager
      Bioinformatics, Kellogg, UM Ann Arbor
     


