Re: ISO 639 "duplicate" codes (was: Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures)

From: Mark Davis (mark.davis@jtcsv.com)
Date: Sat Jul 12 2003 - 17:09:40 EDT


    We did that deliberately. Faced with a situation where a registration
    authority changes IDs on a whim -- with no regard to the issues of
    stability in software and data -- the best policy is to always use the
    old one, and map any new locales to the old one. That way when you
    exchange IDs between old and new systems, it all continues to work.
    (We did in fact know of the latest version of the standard at the
    time.)
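
    A minimal sketch of that policy in Java. The class name
    LegacyLanguageCodes and its two-entry table are hypothetical, made up
    for illustration from the codes discussed in this thread; this is not
    the actual JDK or ICU code. Every incoming ID is folded onto the old
    code before it is compared or used as a lookup key:

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, for illustration only: folds revised ISO 639 codes
// onto the original ones so that old and new systems agree on one ID.
final class LegacyLanguageCodes {

    // Revised code -> original code. Only the pairs named in this thread.
    private static final Map<String, String> NEW_TO_OLD = Map.of(
        "he", "iw",   // Hebrew
        "id", "in"    // Indonesian
    );

    // Returns the old code for a revised one; any other code is returned
    // unchanged (lowercased).
    static String toLegacy(String code) {
        String lower = code.toLowerCase(Locale.ROOT);
        return NEW_TO_OLD.getOrDefault(lower, lower);
    }

    // True when both codes denote the same language after folding.
    static boolean sameLanguage(String a, String b) {
        return toLegacy(a).equals(toLegacy(b));
    }

    public static void main(String[] args) {
        System.out.println(toLegacy("he"));            // iw
        System.out.println(toLegacy("iw"));            // iw
        System.out.println(sameLanguage("id", "in"));  // true
        System.out.println(toLegacy("fr"));            // fr
    }
}
```

    Because the old code is the single canonical form, data written by an
    old system needs no conversion, and a new system only has to fold its
    input once at the boundary.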

    (In ICU, we did add a more general-purpose aliasing mechanism, both
    for resource bundles and parts thereof.)

    Mark
    __________________________________
    http://www.macchiato.com
    ► “Eppur si muove” ◄

    ----- Original Message -----
    From: "Philippe Verdy" <verdy_p@wanadoo.fr>
    To: "Doug Ewell" <dewell@adelphia.net>
    Cc: <unicode@unicode.org>
    Sent: Saturday, July 12, 2003 00:27
    Subject: Re: ISO 639 "duplicate" codes (was: Re: Ligatures in Turkish
    and Azeri, was: Accented ij ligatures)

    > On Saturday, July 12, 2003 6:51 AM, Doug Ewell <dewell@adelphia.net>
    > wrote:
    >
    > > Philippe Verdy <verdy_p at wanadoo dot fr> wrote:
    > >
    > > > Good luck with ISO language codes, which do not even
    > > > define them, and contain many duplicate codes even in
    > > > the Alpha-2 space (he/iw, in/id), or imprecise codes
    > > > that sometimes match very imprecise families of languages
    > > > overlapping other language codes...
    > >
    > > The codes "iw" for Hebrew and "in" for Indonesian were deprecated
    > > FOURTEEN YEARS AGO. It is not accurate or fair to refer to them as
    > > "duplicates" of "he" and "id". The Registration Authority deprecates
    > > such codes, rather than deleting them, for backward compatibility
    > > with any data that might contain the old codes.
    >
    > I was also sure that "iw" was not used today, until I found that it
    > is still used in Java on Windows, for legacy reasons... Creating a
    > resource bundle in Hebrew with the code "he" was simply... ignored.
    > So I had to rename it to "iw".
    >
    > Sadly, on Linux or various Unixes the correct code to use for
    > locales varies, and it comes from the user-environment settings,
    > usually set up by a system profile... Users who want to benefit
    > from existing locales for Hebrew will constantly need to switch
    > between "he" and "iw". The "normal" installation solution today is
    > still to create a file link between the "he" and "iw" resources, so
    > that both can be used.
    >
    > I was really disappointed when I saw that these legacy language
    > codes could not be simplified the way we would expect, by ignoring
    > "iw" and "in". Still today, Java does not offer a way to create
    > "links" at runtime to resolve locales with equivalent ids, without
    > duplicating resources or creating special rules such as:
    > if (code.equals("he") || code.equals("iw"))
    > (don't forget that Java also has run-time resources with no
    > files)...
    >
    >
    >


