Re: Questions re ISO-639-1,2,3

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue Aug 23 2005 - 03:32:59 CDT


    From: "Doug Ewell" <dewell@adelphia.net>
    > ISO 3166-1 alpha-2 and alpha-3 code elements are almost identical in
    > their stability (or lack thereof). I can find no instances in the
    > 31-year history of ISO 3166 where an alpha-3 code element was changed
    > while the corresponding alpha-2 code was left unchanged. (If you can
    > find one, please accept my apologies.)

    Yes, the alpha-3 code for a country can change, but alpha-3 codes have so far
    never been reassigned to a different country, unlike alpha-2 codes. So a
    change of alpha-3 code just turns the old official code into an alias.

    For example ROM changed to ROU, but ROM was not reassigned to another
    country.
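
    A minimal sketch of what treating a retired alpha-3 code as an alias could
    look like. The table name and function are hypothetical, and only the
    ROM -> ROU entry comes from this message; a real table would have to be built
    from the ISO 3166 maintenance records.

```python
# Hypothetical alias table: retired alpha-3 code -> current alpha-3 code.
# Only the ROM -> ROU entry is taken from the message; anything else
# would need to come from the official ISO 3166 change history.
RETIRED_ALPHA3 = {"ROM": "ROU"}


def canonical_alpha3(code: str) -> str:
    """Return the current alpha-3 code, following a retired alias if one exists."""
    code = code.upper()
    return RETIRED_ALPHA3.get(code, code)
```

    Because a retired alpha-3 code is never handed to another country, this
    lookup needs no date: canonical_alpha3("ROM") can only mean Romania.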

    The reassignment of alpha-2 codes to different countries is the main
    problem for their use in locale codes, which require longer stability than
    dated statistics do.

    What this means is that the alpha-2 codes need to be dated to be
    disambiguated.
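
    To illustrate the point about dating, here is a rough sketch of a lookup
    that needs both the code and a date. Everything in it is an assumption made
    for illustration: the table layout, the function name, and the choice of
    "CS" as the example (used for Czechoslovakia, later for Serbia and
    Montenegro). The dates are rounded to year boundaries and are not
    authoritative.

```python
from datetime import date
from typing import Optional

# Hypothetical assignment history: alpha-2 code -> list of
# (start, end, country) periods, with `end` exclusive and None meaning
# "still assigned" in this sketch. Dates are illustrative year
# boundaries, not official ISO 3166 effective dates.
ALPHA2_HISTORY = {
    "CS": [
        (date(1974, 1, 1), date(1993, 1, 1), "Czechoslovakia"),
        (date(2003, 1, 1), None, "Serbia and Montenegro"),
    ],
}


def resolve_alpha2(code: str, on: date) -> Optional[str]:
    """Return the country an alpha-2 code designated on a given date, if any."""
    for start, end, country in ALPHA2_HISTORY.get(code.upper(), []):
        if start <= on and (end is None or on < end):
            return country
    return None
```

    Without the date argument the lookup is ambiguous, which is exactly the
    problem for locale identifiers that carry no date of their own.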

    > The numeric code elements (henceforth "codes"), which are really UN
    > codes rather than ISO codes

    That's what I said (UNSD means United Nations Statistics Division, if this
    was not clear).

    > are usually considered more stable, but it
    > depends on what kind of stability you are looking for. ISO alpha codes
    > change when the name of a country changes (or whenever the country feels
    > like changing it; see Romania). UN numeric codes change when the
    > territory covered by the code changes. Normally the latter event is
    > less frequent than the former, but the reverse can also happen; in 1993,
    > the numeric code for Ethiopia changed from 230 to 231 (because of the
    > loss of territory to Eritrea) while the alpha codes remained ET and ETH.

    OK, but 230 has *still* not been reassigned (it easily could be, given the
    much smaller encoding space of numeric codes, which are geographically
    structured), so it has become an alias for Ethiopia. Such an alias remains
    valid for references in documents that speak about the country before the
    split, or that were composed with localization meta-data. Of course,
    documents speaking about the country after the split should use the new
    code, to avoid the ambiguity with Eritrea, but this does not invalidate the
    past references. The same would be true for any country code, including the
    IOC 3-letter country codes, or other standards.

    My opinion is that the UNSD wants to keep the possibility of making
    historical searches in its data without mixing, in the same result list, the
    statistics of unrelated countries or territories. This is however less of a
    problem for the UN, given that statistics are necessarily dated (which is
    not the case for many documents needing locale code markup or meta-data).


