Re: are Unicode codes somehow specified in official national linguistic literature ? (worldwide)

From: Mark Davis ([email protected])
Date: Wed Jun 14 2006 - 20:47:16 CDT

Next message: John Hudson: "Re: Glyphs for German quotation marks"

Previous message: Kenneth Whistler: "Re: Mnemonics for LAO LETTER HO TAM"
In reply to: Philippe Verdy: "Re: are Unicode codes somehow specified in official national linguistic literature ? (worldwide)"
Next in thread: Erkki Kolehmainen: "Re: are Unicode codes somehow specified in official national linguistic literature ? (worldwide)"

Actually, the cases discussed do involve fallback from one family to
another, such as from (a particular kind of) Sami to Norwegian, or Breton to
French. We had quite a number of discussions about this, and finally
concluded that we should not build the fallback in at a low level, since the
particular fallback may very well be an individual preference, but that the
best we should do would be to have a structure that suggested a default
fallback for the locale, which could be overridden according to the user
preference.
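
As a rough illustration of that kind of structure (a sketch only, not what
CLDR or ICU actually implements): the class name LocaleFallback, the
SUGGESTED table and the chain() method below are all hypothetical, and the
two table entries are just the examples from this thread. The chain first
truncates the identifier, then follows either the user's override or the
suggested default for the bare language, and ends at the root.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Sketch: a suggested per-language fallback that a user preference can override. */
final class LocaleFallback {
    // Hypothetical default suggestions (illustrative data, not taken from CLDR):
    // Northern Sami -> Norwegian, Breton -> French.
    static final Map<String, String> SUGGESTED = Map.of("se", "no", "br", "fr");

    /**
     * Builds the lookup order for a locale id such as "br_FR".
     * userOverrides maps a bare language code to the language the user
     * prefers to fall back to; it takes precedence over SUGGESTED.
     */
    static List<String> chain(String localeId, Map<String, String> userOverrides) {
        Set<String> result = new LinkedHashSet<>();
        String current = localeId;
        // result.add() returns false for a repeated entry, which also stops cycles.
        while (current != null && result.add(current)) {
            int cut = current.lastIndexOf('_');
            if (cut > 0) {
                // Ordinary truncation: "br_FR" -> "br".
                current = current.substring(0, cut);
            } else {
                // Bare language: the user's choice wins over the suggested default.
                current = userOverrides.getOrDefault(current, SUGGESTED.get(current));
            }
        }
        result.add("root");
        return new ArrayList<>(result);
    }
}
```

With no overrides, chain("br_FR", Map.of()) gives br_FR, br, fr, root; a user
who prefers Finnish after Sami would pass Map.of("se", "fi") and get
se_NO, se, fi, root for "se_NO".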

Mark

On 6/14/06, Philippe Verdy <[email protected]> wrote:
>
> I already submitted some remarks there, but it's been a long time, and
> the CLDR has evolved (as well as the ICU library) and my initial comments
> may look outdated regarding the new developments.
>
> But this bug report is not really discussing the fallback mechanism from
> one language to a language family, but from a variant to a language, or the
> fallback for languages that have multiple codes or legacy codes (he/iw,
> in/id) as seen in Java VMs, where the legacy codes (like iw, in) are still
> the only ones that work, since this preserves the compatibility of old
> applications that depended on them for finding their resources with the
> standard class loader of Java 1.3/1.4 (and even 5.0).
>
> I still hope that the successor of RFC 3066 will come soon to describe
> correctly the new locale identifiers (and especially the new ISO 15924 field
> for the indication of scripts).
>
> But given that ISO 639-3 is still not finalized, it will be hard to find a
> definitive solution for designating locales and all their known aliases, and
> still preserve the compatibility of legacy applications depending on these
> identifiers.
>
> ICU for now proposes a temporary solution for resolving the resource
> fallback path, but it certainly requires more thought to handle all
> possible cases (and the interaction of language identifiers with ISO 3166
> country/region identifiers, or the new aliases introduced now by deprecating
> the ISO 3166 country/region identifiers in favor of more precise ISO 639-3
> language identifiers).
>
> The current locale fallback mechanism implemented in legacy applications
> is most often fixed and various systems use different fallback algorithms to
> determine alternate locales. In Java for example, this mechanism also
> interacts not only with the user settings, but also with the local system
> settings, when no user locale matches with a given resource id. But there's
> still no way in Java to go after the first field of the locale id, as its
> parent is a single root, and not another locale.
>
> Even the Java Locale class still does not include a constructor to specify
> the script identifier (one could specify it in the variant identifier, but
> its place in the third position, after the country identifier, is not the
> best one for correct locale resolution, as this should be in the second
> position, between the language code and the region code). If one uses the field
> normally reserved for the country to set the script code, it won't interact
> cleanly with legacy applications that use country codes.
>
> So one must use one's own class loader, with its own fallback mechanism,
> and create a new class to extend the Locale object, and implement various
> tricks to make it work with the standard locale interface. This is more or
> less what ICU does to support extended locale identifiers and aliases.
>
> ----- Original Message -----
> *From:* Mark Davis <[email protected]>
> *To:* Philippe Verdy <[email protected]>
> *Cc:* Erkki Kolehmainen <[email protected]> ; Cristian Secară<[email protected]>;
> [email protected]
> *Sent:* Wednesday, June 14, 2006 9:08 PM
> *Subject:* Re: are Unicode codes somehow specified in official national
> linguistic literature ? (worldwide)
>
> There is a planned mechanism: see
> http://dev.icu-project.org/cgi-bin/locale-bugs?findid=698
>
> (This was planned for 1.4, but delayed since we didn't have enough data to
> warrant adding the mechanism.)
>
>


This archive was generated by hypermail 2.1.5 : Wed Jun 14 2006 - 21:03:48 CDT