Re: Common Locale Data Repository Project

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sat Apr 24 2004 - 11:26:46 EDT

Next message: Peter Kirk: "Re: Variation selectors and vowel marks"

Previous message: Peter Constable: "RE: Common Locale Data Repository Project"
In reply to: Peter Constable: "RE: Common Locale Data Repository Project"
Next in thread: Peter Constable: "RE: Common Locale Data Repository Project"

From: "Peter Constable" <petercon@microsoft.com>
> > For now, the only workable solution to solve these issues is found in
> > supplementary libraries in ICU which support locale aliases. (Yes I
> > use the term Locale because this is the term that Java gives to this
> > identification,
>
> NO. That is the term Java (and other things) give to a *different*
> identification. There are languages, there are cultures/locales. The two
> are not the same.

Then a problem will remain in Java locales, unless the Java community
accepts that the language part of a locale will contain the language
subtags of RFC 3066 or its successor, so that the API can implement a language
resolver for that part only, ignoring the second and third parameters, which
would be used only to specify other (non-language) elements of a Locale.

For now it's well known that if you create a Java application with resource
bundles for Hebrew, you have to use the "iw" language parameter to name your
bundle; if you use "he", then the same properties file or class that is part of
a bundle will not be found on an OS that the Java runtime determines as
supporting the "iw" locale, and the application will then display only the
default locale (most often English). Note that Hebrew is part of the set of
fully supported languages in Java. I doubt that the JRE will be changed to use
the "he" code by default as long as the locale resolver in Java is not updated
to use a more clever algorithm than simple equality of language codes.
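To illustrate what "more clever than equality" could mean for the Hebrew case, here is a minimal sketch of an alias-aware lookup. It is not how the JRE resolves bundles; the class LanguageAliases and its methods are hypothetical names, and it works on plain strings so that it does not depend on how a given Java version normalizes "he" and "iw" inside java.util.Locale. It assumes Java 9 or later for Map.of.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of the JRE: treats a deprecated ISO 639
// code and its replacement as the same language.
final class LanguageAliases {
    // Deprecated code -> current code.
    private static final Map<String, String> OLD_TO_NEW =
        Map.of("iw", "he", "in", "id", "ji", "yi");

    // Every spelling under which a bundle for this language might be filed,
    // starting with the spelling that was asked for.
    static List<String> equivalents(String code) {
        String c = code.trim().toLowerCase(Locale.ROOT);
        List<String> result = new ArrayList<>();
        result.add(c);
        String newer = OLD_TO_NEW.get(c);
        if (newer != null) {
            result.add(newer);
        }
        for (Map.Entry<String, String> e : OLD_TO_NEW.entrySet()) {
            if (e.getValue().equals(c)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // Language comparison that goes beyond string equality.
    static boolean sameLanguage(String a, String b) {
        return equivalents(a).contains(b.trim().toLowerCase(Locale.ROOT));
    }

    // Candidate bundle names to try, in order, for one base name.
    static List<String> bundleNames(String baseName, String language) {
        List<String> names = new ArrayList<>();
        for (String code : equivalents(language)) {
            names.add(baseName + "_" + code);
        }
        return names;
    }
}
```

With this, an application that shipped Messages_he.properties could still be matched on a system reporting "iw", because bundleNames("Messages", "iw") yields both Messages_iw and Messages_he as candidates.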

Same problem for the Traditional Chinese language: Java supports it natively
only through the "TW" country code, given separately from the "zh" language
code. If things must change later, the Java runtime should learn to work with a
"zh-Hant" language identifier that can be used in every country where the
language is used. Using "zh_TW" (i.e. a separate "zh" language code and the
separate "TW" country code) has the bad effect of also applying other locale
standards appropriate only for Taiwan, but not for Macau, Hong Kong, Singapore,
Reunion and other Indian Ocean, South Asian and South African countries or
territories where this language is used with other national locale conventions
(currency, time and numeric formats, phone numbers...)
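A small sketch may make the separation concrete: the text language selects the resource bundle, while the region alone selects national conventions such as the currency. The class SplitLocale and its methods are hypothetical; only Locale.Builder and Currency.getInstance(Locale) are real JDK calls, and the bundle naming is an assumption made for illustration.

```java
import java.util.Currency;
import java.util.Locale;

// Hypothetical holder that keeps "which language to display" apart from
// "which national conventions to apply".
final class SplitLocale {
    final String textLanguage; // e.g. "zh-Hant", used only to pick text
    final String region;       // e.g. "HK", used only for conventions

    SplitLocale(String textLanguage, String region) {
        this.textLanguage = textLanguage;
        this.region = region;
    }

    // The bundle name depends on the language identifier only, so the same
    // translated text serves every region.
    String bundleName(String baseName) {
        return baseName + "_" + textLanguage.replace('-', '_');
    }

    // The currency depends on the region only.
    String currencyCode() {
        Locale regionOnly = new Locale.Builder().setRegion(region).build();
        return Currency.getInstance(regionOnly).getCurrencyCode();
    }
}
```

Under these assumptions a user in Hong Kong and a user in Taiwan would load the same Messages_zh_Hant text but get different currencies, instead of the Hong Kong user inheriting Taiwanese conventions through "zh_TW".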

In fact I would like to see "Traditional" and "Simplified" Chinese treated as
distinct languages in the same family. An application would do better to use
"zht" and "zhs" language codes to make the distinction, so that "zh" would
become an identifier for a family of Han-written languages rather than a
language identifier, and so a legacy code. This also means changes in the Locale
resolver, so that an OS and user locale which indicates "zhs" or "zht" will
first look for resources marked with the respective language code, and only then
attempt to use a "zh" resource if none is found.
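The lookup order described here can be sketched as a simple parent chain. Note that "zhs" and "zht" are the codes proposed in this message, not registered language codes, and LanguageFallback is a hypothetical class, not an existing Java API. Map.of requires Java 9 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical resolver: try the specific language first, then its family.
final class LanguageFallback {
    // Proposed (unregistered) codes mapped to the legacy family code.
    private static final Map<String, String> PARENT =
        Map.of("zhs", "zh", "zht", "zh");

    // Lookup order for one language, most specific first.
    static List<String> chain(String language) {
        List<String> chain = new ArrayList<>();
        String current = language;
        // The contains() check guards against accidental cycles in PARENT.
        while (current != null && !chain.contains(current)) {
            chain.add(current);
            current = PARENT.get(current);
        }
        return chain;
    }

    // First code in the chain for which the application has resources.
    static Optional<String> resolve(String language, Set<String> available) {
        for (String candidate : chain(language)) {
            if (available.contains(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

An application that ships only a legacy "zh" resource would still be matched for a "zht" user, while one that ships both "zhs" and "zh" would serve the more specific one first.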

A Locale resolver should be able to determine, from each properties file or
class of a bundle, which codes it may support, and a degree/priority of matching
relative to other localized resources. But I have not seen anything that
suggests that an application may be able to provide such a Locale resolver; for
now each application has to write its own resolver to map a user locale to a
matching application-defined supported locale. The automatic resolver in Java
(but other systems like POSIX have the same caveats) seems quite poor, as does
the resolution order (a bit more general) currently suggested in RFC 3066, which
is exactly what was implemented in Java...
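As a rough sketch of what such a resolver might look like, each resource could declare the language codes it serves together with a match score, and the resolver would pick the highest-scoring resource for the user's language. ScoredResolver, its methods and the score values are all hypothetical; nothing here is an existing Java or ICU API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical resolver driven by what each resource declares about itself.
final class ScoredResolver {
    // Resource name -> (language code -> score, higher is better).
    private final Map<String, Map<String, Integer>> declared =
        new LinkedHashMap<>();

    // Registers the codes a resource claims to support, with priorities.
    ScoredResolver declare(String resource, Map<String, Integer> scores) {
        declared.put(resource, scores);
        return this;
    }

    // Best resource for the user's language, or empty if nothing matches.
    // On equal scores the resource declared first wins.
    Optional<String> best(String userLanguage) {
        String bestResource = null;
        int bestScore = 0;
        for (Map.Entry<String, Map<String, Integer>> e : declared.entrySet()) {
            int score = e.getValue().getOrDefault(userLanguage, 0);
            if (score > bestScore) {
                bestScore = score;
                bestResource = e.getKey();
            }
        }
        return Optional.ofNullable(bestResource);
    }
}
```

A bundle named Messages_iw could then declare both "iw" and "he" at full score, and a legacy Messages_zh could declare "zht" and "zhs" at a lower score than a dedicated resource, which covers both cases discussed above without relying on code equality.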

This archive was generated by hypermail 2.1.5 : Sat Apr 24 2004 - 11:51:45 EDT