From: Marco Cimarosti (
Date: Tue Dec 03 2002 - 04:48:04 EST

    Mark Davis wrote:
    > While not a trivial task (about 400 terms), it is many, many
    > times easier than translating all the significant character
    > names. That might someday be worth considering for the
    > Common XML Locale Repository
    > (

    The problem is not the number of terms involved (400 strings is not a big
    deal: it corresponds to a small localization project), but rather the utter
    idiosyncrasy of Unicode-related terminology.

    Terms such as "title case", "caseless", "reordering" or "combining" are
    nearly impossible to translate satisfactorily into other languages, and even
    simple terms such as "character", "letter" or "ideograph" can be tricky if
    they have to remain distinct from each other.

    I myself tried to translate the Unicode glossary into Italian, but I have
    yet to find satisfactory translations for several entries, although I
    consulted experts in disciplines ranging from typography (I still don't
    have a valid equivalent for the "case" in "case sensitive", etc.) to
    Hebrew studies (I am still at odds with "cantillation marks").

    IMHO, if such an effort is really worth doing, it should be organized and
    promoted by the Unicode Consortium itself, rather than in an OEM library
    like ICU.

    I suggest that the 400-odd property and value names be listed in a text file
    on the Unicode FTP site (with each English term well commented and
    explained) and that translations be collected on a voluntary basis, as was
    done for the "What's Unicode?" text. The copyright on this material should
    grant free and unrestricted usage to any implementation such as ICU.
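
    As a rough sketch of what such a file might look like: the layout below
    (English term, language tag, translation, separated by semicolons, with
    "#" comments explaining each term), the two sample Italian strings and
    the parse_translations helper are all assumptions made up for
    illustration, not an existing Unicode data file or API.

```python
# Hypothetical file layout: "term ; language ; translation".
# A '#' starts a comment that explains the English term to translators.
SAMPLE = """\
# Lowercase_Letter: a letter in its small form
Lowercase_Letter ; it ; lettera minuscola
# Uppercase_Letter: a letter in its capital form
Uppercase_Letter ; it ; lettera maiuscola
"""


def parse_translations(text):
    """Return {term: {language: translation}} from the hypothetical format."""
    table = {}
    for raw in text.splitlines():
        # Drop any comment and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        term, lang, translation = (f.strip() for f in line.split(";", 2))
        table.setdefault(term, {})[lang] = translation
    return table


translations = parse_translations(SAMPLE)
```

    An implementation such as ICU could then look up a term by its English
    name and a language tag, falling back to the English name when no
    translation has been contributed yet.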

    _ Marco

