ISO 639-3 beta input form (was: Questions re ISO-639-1,2,3)

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed Aug 24 2005 - 11:05:57 CDT


    I also see that, in the ISO 639-3 list of languages, the input form for
    selecting language names lacks an option for languages whose names don't
    start with a letter (and that also have no ISO 639-1 or -2 codes), notably:

    * [oun] !O!ung (Scope=I, Type=L)
    * [nmn] !X (Scope=I, Type=L)
    * [alu] 'Are'are (Scope=I, Type=L)
    * [kud] 'Auhelawa (Scope=I, Type=L)
    * [hnh] //Ani (Scope=I, Type=L)
    * [gnk] //Gana (Scope=I, Type=L)
    * [xeg] //Xegwi (Scope=I, Type=E)
    * [gwj] /Gwi (Scope=I, Type=L)
    * [xam] /Xam (Scope=I, Type=E)
    * [huc] =/Hua (Scope=I, Type=L)
    * [aue] =/Kx'au//'ein (Scope=I, Type=L)

    They all have Scope=I (Individual language), and either Type=L (Living) or
    Type=E (Extinct). Is that because they are still aliases, or still not
    completely specified (notably their standard English names)? If not, then
    there should be an "Other" option in the form's input selector.
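
    To illustrate the "Other" option suggested above, here is a minimal Python
    sketch of how a selector could bucket names by initial. The function names
    are hypothetical and the entries are simply copied from the list above; I
    don't know how the actual form is implemented.

```python
from collections import defaultdict

# Sample entries copied from the list above: (ISO 639-3 code, name).
LANGUAGES = [
    ("oun", "!O!ung"),
    ("alu", "'Are'are"),
    ("hnh", "//Ani"),
    ("gwj", "/Gwi"),
    ("huc", "=/Hua"),
]


def initial_bucket(name: str) -> str:
    """Return the uppercase first letter, or "Other" for non-letter initials."""
    first = name[:1]
    return first.upper() if first.isalpha() else "Other"


def group_by_initial(entries):
    """Group (code, name) pairs under the bucket of each name's initial."""
    groups = defaultdict(list)
    for code, name in entries:
        groups[initial_bucket(name)].append((code, name))
    return dict(groups)
```

    With such a bucket, every name in the list above would be reachable under
    "Other" instead of being left out of the selector.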

    Also, "!O!ung" is listed in Ethnologue as a living language (in the Khoisan
    family) spoken by a small community in Angola, and Ethnologue also gives
    another alias for it (!O!kung). I'd like to know whether these "!", "/",
    "=" and "'" are used to replace unencoded characters or diacritics, or
    whether it's a technical issue on the Ethnologue.com and SIL.org web sites...

    I suspect that "/" means the combining slash overlay and "//" the combining
    double-slash overlay, and I suspect the quote to mean the apostrophe letter,
    but what does the equal sign mean? I also suspect that those languages don't
    have known orthographies (only spoken for now)...
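
    Another possible reading, offered purely as an assumption and not confirmed
    by SIL or Ethnologue, is that these sequences are ASCII stand-ins for the
    click letters encoded at U+01C0..U+01C3 rather than for combining overlays.
    The sketch below shows what restoring them would look like under that
    assumption; the mapping and the function name are mine.

```python
# Assumed mapping (NOT confirmed by the source): ASCII stand-in -> click letter.
# Longer sequences are listed first so "//" and "=/" are matched before "/".
ASCII_TO_CLICK = [
    ("=/", "\u01C2"),  # U+01C2 LATIN LETTER ALVEOLAR CLICK
    ("//", "\u01C1"),  # U+01C1 LATIN LETTER LATERAL CLICK
    ("/", "\u01C0"),   # U+01C0 LATIN LETTER DENTAL CLICK
    ("!", "\u01C3"),   # U+01C3 LATIN LETTER RETROFLEX CLICK
]


def restore_clicks(name: str) -> str:
    """Replace the assumed ASCII stand-ins with click letters, left to right."""
    out = []
    i = 0
    while i < len(name):
        for ascii_seq, click in ASCII_TO_CLICK:
            if name.startswith(ascii_seq, i):
                out.append(click)
                i += len(ascii_seq)
                break
        else:
            # No stand-in starts here: keep the character (apostrophes included).
            out.append(name[i])
            i += 1
    return "".join(out)
```

    The apostrophe is deliberately left untouched, since its intended meaning is
    not settled by this guess.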

    Are there plans to include in ISO 639-3 the alias names listed by
    Ethnologue?

    Are there plans to list the ISO 639-3 codes of the individual languages
    referred to by languages with Scope=C (collective languages, such as
    "Afro-Asiatic (Other)", whose 639-2 code is "afa", or "Bihari", whose 639-1
    and 639-2 codes are "bh" and "bih", but that won't have ISO 639-3 codes)?
    Same question for Scope=M (macrolanguages).

    Or is the plan instead to include this reference within the metadata
    associated with each individual language (and so avoid changing long lists
    of codes in the metadata associated with collective languages)?
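
    As a rough sketch of that second option: each individual language carries
    one reference to its group, and the membership lists are derived on demand
    instead of being maintained by hand. The field name "part_of" and the
    function name are hypothetical, and the records are only illustrative
    samples, not the actual ISO 639-3 data layout.

```python
from collections import defaultdict

# Illustrative records; "part_of" is a hypothetical field name.
INDIVIDUAL_LANGUAGES = {
    "arb": {"name": "Standard Arabic", "part_of": "ara"},
    "arz": {"name": "Egyptian Arabic", "part_of": "ara"},
    "cmn": {"name": "Mandarin Chinese", "part_of": "zho"},
    "yue": {"name": "Yue Chinese", "part_of": "zho"},
    "oun": {"name": "!O!ung", "part_of": None},
}


def members_by_group(languages):
    """Derive group -> sorted member codes from the per-language references."""
    index = defaultdict(list)
    for code, record in sorted(languages.items()):
        parent = record["part_of"]
        if parent is not None:
            index[parent].append(code)
    return dict(index)
```

    Adding a new individual language then touches only its own record; the
    group's list follows automatically.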

    Finally, is ISO 639-3 meant to be used for tagging more precisely the
    various written or spoken texts or other localized data? What will the
    relation of ISO 639-3 to BCP 47 be? Notably, will the ISO 639-1 and -2
    codes, when they exist, still be preferable to the ISO 639-3 codes? I think
    they should, so that existing documents and localized data won't need to be
    updated with new language codes.
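
    The preference I am arguing for could be sketched as follows. This is only
    my proposal expressed as code, not an agreed policy; the type and function
    names are hypothetical.

```python
from typing import NamedTuple, Optional


class LanguageCodes(NamedTuple):
    """The codes one language may have in each part of ISO 639 (None if absent)."""
    iso639_1: Optional[str]
    iso639_2: Optional[str]
    iso639_3: Optional[str]


def preferred_code(codes: LanguageCodes) -> str:
    """Pick the ISO 639-1 code if any, else ISO 639-2, else ISO 639-3."""
    for candidate in (codes.iso639_1, codes.iso639_2, codes.iso639_3):
        if candidate:
            return candidate
    raise ValueError("no ISO 639 code available")


# Sample entries: Hebrew has codes in all three parts, "!O!ung" only in 639-3,
# and the collective "Bihari" has no 639-3 code.
HEBREW = LanguageCodes("he", "heb", "heb")
O_UNG = LanguageCodes(None, None, "oun")
BIHARI = LanguageCodes("bh", "bih", None)
```

    Under this rule, data already tagged "he" stays valid, and only languages
    without an older code would be tagged with a 639-3 code such as "oun".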

    I also hope that there's no conflict between 3-letter ISO 639-2 codes and
    3-letter ISO 639-3 codes, and that there's already an agreed policy to use
    the same codes where possible (except for legacy alias codes in ISO 639-2).
    Let's not repeat the difficulties found in ISO 639-2 between terminology and
    bibliographic codes, or in ISO 639-1 with languages that have had their code
    changed, such as Hebrew [iw=>he] or Indonesian [in=>id].
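
    To show the kind of burden such code changes leave behind, here is a small
    sketch of the alias table every consumer ends up carrying. Only the two
    changes mentioned above are included, and the function name is hypothetical.

```python
# Withdrawn two-letter codes mentioned above, mapped to their replacements.
LEGACY_ALIASES = {
    "iw": "he",  # Hebrew
    "in": "id",  # Indonesian
}


def canonical_code(code: str) -> str:
    """Lowercase a code and map a known legacy alias to its current code."""
    normalized = code.lower()
    return LEGACY_ALIASES.get(normalized, normalized)
```

    Each further code change would add another entry that has to be kept
    forever, which is why reusing existing codes seems preferable.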


