RE: [africa] Re: Questions: locales; CLDR process; ISO-639 (again)

From: Peter Constable
Date: Wed Mar 01 2006 - 12:53:49 CST


    > From: [] On Behalf Of Donald Z. Osborn

    > Quoting John Cowan <>:

    Hmmm... evidently some msgs from this list aren't getting past our spam
    filters.

    > Also I note that the locale form needs language code and country code.
    > trying to make arguments here, but to understand how best to use the
    > system and all the various codes.

    Keep in mind that a locale is different from a language. Don't confuse
    the need to use a country ID to reflect a regional dialect or spelling
    differences ("language" identification) with the need to include a
    country ID to reflect processing parameters associated with a country
    such as default currency (locale identification). Language distinctions
    are always part of a locale, so when a country ID is needed for language
    distinctions the language ID can look the same as a locale ID. But
    there's a logical distinction: locale IDs generally include a country ID
    since locales generally have some country-based data, but not all
    language tags require a country ID.
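
The distinction above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: `Tag`, `split_tag`, and `has_country_id` are hypothetical names, and the parser handles nothing beyond a bare language code or a language code followed by a two-letter country ID.

```python
from typing import NamedTuple, Optional


class Tag(NamedTuple):
    language: str
    region: Optional[str]


def split_tag(tag: str) -> Tag:
    """Split a simple 'language' or 'language-COUNTRY' tag (hypothetical helper)."""
    parts = tag.split("-")
    language = parts[0].lower()
    region = None
    # Assumption: a trailing two-letter alphabetic subtag is a country ID.
    if len(parts) > 1 and len(parts[-1]) == 2 and parts[-1].isalpha():
        region = parts[-1].upper()
    return Tag(language, region)


def has_country_id(tag: str) -> bool:
    """Locale IDs generally carry a country ID; language tags may omit it."""
    return split_tag(tag).region is not None
```

Here `split_tag("kpe")` is a complete language tag with no country, while `split_tag("kpe-LR")` has the shape a locale ID would generally take. The string alone cannot say whether the country ID is there for a language distinction or for country-based data such as currency; that is the logical distinction, not a syntactic one.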

    > > Work on RFC 3066ter, which will incorporate ISO 639-3 tags, has not
    > > formally begun. The intention of most of the various players, however,
    > > is to use a design in which a language encompassed by a 639-3
    > > will have a two-part language subtag, of the form zh-yue
    > > So 639-3 code elements for languages that are *not* macrolanguages
    > > be added directly, but code elements like yue will not: yue will
    > > exist in Internet language tags as part of the compound subtag
    > Thanks for this clarification. Actually the "nesting" of the '3 codes
    > under a '1 or a '2 code makes a lot of sense. Two questions:
    > 1) Can one file a locale before 3/15 using this format "ff-ffm-ML" even
    > though the design is not yet official?

    If you mean file a locale into CLDR, that's a question for the CLDR
    list, not this list.

    > Beyond that I see that there may be a lot of discussion on the roles
    > and use of the different codes in the case of different
    > (macro)languages. In the case of Arabic, for example, would a simple
    > ar-EG be enough or would you need
    > alternatively want to rule out) ar-arz-EG (arz=Egyptian spoken Arabic),
    > while at the same time allowing perhaps that less widely spoken dialects in
    > country be noted?

    Standard Arabic is used across Arabic-speaking countries and is
    generally the preferred variety for text. This is what would almost
    certainly be used in Arabic locale data. Thus, ar-EG is probably the
    most appropriate for this case. If someone is specifically using a
    locale for creating and working with content or resources in arz, then
    ar-arz-EG might be an appropriate locale -- but note, it would be a
    different locale than ar-EG.
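
To make the point concrete, here is a minimal sketch that treats the two IDs as distinct locales. It assumes the compound-subtag design discussed in this thread (language, optional three-letter subtag, optional two-letter country), which was not official at the time; `LocaleParts` and `parse_locale` are hypothetical names, not part of any library.

```python
from typing import NamedTuple, Optional


class LocaleParts(NamedTuple):
    language: str
    extended: Optional[str]  # e.g. "arz" nested under "ar"
    region: Optional[str]


def parse_locale(tag: str) -> LocaleParts:
    """Parse 'lang', 'lang-REGION', 'lang-ext', or 'lang-ext-REGION'."""
    parts = tag.split("-")
    language = parts[0].lower()
    extended = None
    region = None
    for part in parts[1:]:
        # A three-letter subtag is accepted only before any country ID.
        if len(part) == 3 and part.isalpha() and extended is None and region is None:
            extended = part.lower()
        elif len(part) == 2 and part.isalpha() and region is None:
            region = part.upper()
        else:
            raise ValueError(f"unsupported subtag {part!r} in {tag!r}")
    return LocaleParts(language, extended, region)


standard = parse_locale("ar-EG")      # Standard Arabic as used in Egypt
egyptian = parse_locale("ar-arz-EG")  # content specifically in arz
```

The two results share a language and a country but differ in the nested subtag, so `standard != egyptian`: data filed under one would not automatically apply to the other.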

    > But today, if we were filing two locales for Kpelle, what would be the
    > coding? I'm assuming that kpe-LR and kpe-GN would be the best (or least
    > bad) choices even if later the xpe and gkp have to be added?

    Again, a question for the CLDR list.

    > So another question (sorry these are accumulating) is what kpe-xpe-LR
    > kpe-gkp-GN locales would offer to a group localizing for Kpelle "kpe" as
    > a transborder, multidialect (macro)language?

    At this point, I think that's a question for the language communities to
    decide, not us.


    4. Going back to ISO-639 in general (I know this subject has been
    discussed before but please bear with me), is there going to be any kind
    of feedback between the processes of developing locales and localization
    on one hand and amending the list of ISO-639 codes on the other? I
    recall there being some mention of a block on new ISO-639-1 and 2 codes,
    or that a 1 code will not be given where there is a 2 code, but that
    *maybe* a new 1 and 2 code could be given (Runyakitara might be a
    candidate for the latter). Also mention of possible additional ISO-639
    codes beyond the three ranges already. What is the latest on all this?
    > >> 4. Going back to ISO-639 in general [...]
    > >> What is the latest on all this?
    > >
    > > I think, but I am not sure, that no new 639-1 codes can be added
    > > 639-3 goes into effect. (In principle, a language missed by 639-3
    > > be added simultaneously to -1, -2, and -3, but the chance that such
    > > language both has been missed and meets the criteria for -1 is

    The JAC loosely committed not to add something to -1 that was already in
    -2. (I say "loosely" meaning that they did not rule out the possibility
    that circumstances might change in the future mandating a need for a new
    alpha-2 where an alpha-3 already exists in -2.) The JAC has never made
    a similar commitment wrt -3. But, we were just recently discussing the
    future of -1, and while this specific concern wrt -3 didn't come up, we
    were thinking that we should further constrain -1 so that requests to
    add alpha-2 would no longer be accepted from anybody but could only come
    from an ISO member body. This would really reduce the number of requests
    we get for alpha-2 IDs.

    > > Any 639-3 language could be added to 639-2, using the same code
    > > for it in both parts of the standard.

    639-2 will become a subset of the union of 639-3 and 639-5 (the latter
    for collections); there will be a single alpha-3 code space. The
    criteria for inclusion in 639-2 are likely to get further constrained
    from what it is now. In effect, 639-2 will become a profile of alpha-3
    of interest to a particular user community; the TC46 reps to the JAC
    will be working on a proposal for how we define that user community.
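
The subset relationship described above can be expressed as a simple set check. The sets below are small illustrative samples chosen for this sketch, not the full code tables, and `is_alpha3_profile` is a hypothetical name.

```python
# Illustrative samples only; these are NOT the complete code tables.
ISO_639_3_SAMPLE = frozenset(
    {"ara", "arz", "zho", "yue", "kpe", "xpe", "gkp", "ful", "ffm"}
)
ISO_639_5_SAMPLE = frozenset({"afa", "nic", "bnt"})  # language collections
ISO_639_2_SAMPLE = frozenset({"ara", "zho", "kpe", "ful", "afa", "nic", "bnt"})


def is_alpha3_profile(part2: frozenset, part3: frozenset, part5: frozenset) -> bool:
    """True if every Part 2 code also appears in Part 3 or Part 5."""
    return part2 <= (part3 | part5)


# A single alpha-3 code space means individual languages (Part 3) and
# collections (Part 5) never share a code.
single_code_space = ISO_639_3_SAMPLE.isdisjoint(ISO_639_5_SAMPLE)
profile_ok = is_alpha3_profile(ISO_639_2_SAMPLE, ISO_639_3_SAMPLE, ISO_639_5_SAMPLE)
```

In this sample, codes such as `arz` or `xpe` sit in the alpha-3 space without being in the 639-2 profile, which is the situation a more constrained 639-2 would produce.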

    > I'm thinking that language change, planning, and engineering would
    > for some flexibility on this...

    There's no question that the plane of language varieties will change,
    especially in developing nations as language planning and development
    activities bring greater standardization and stabilization of languages.
    This will be one of the challenges we face in language coding, and
    perhaps also in software implementations. One thing to keep in mind is
    that something like a software localization has potential to be a
    significant factor in how the sociolinguistic scenery evolves.

    Peter Constable
