Re: Variant locales?

From: Peter_Constable@sil.org
Date: Mon Apr 22 2002 - 14:40:06 EDT


On 04/22/2002 12:06:18 PM Michael Everson wrote:

>At 09:32 -0700 2002-04-22, Deborah Goldsmith wrote:
>>I had a recent inquiry from inside Apple as to whether there was a
>>registry of variants of the standard ISO locales, e.g. ja_JP.kana
>>for Japanese written only with kana. Does anyone know if there is
>>any standard that attempts to describe such things?
>
>The cultural registry at DKUUG can do that kind of thing, but I don't
>think it is very highly stocked. For some of these things you can
>combine language and country codes fairly freely.

Before suggesting that one freely combine language and country codes (not
that that will help in this particular case), I'd like to mention the paper
I'll be presenting at IUC21, in which I suggest the need for (and propose a
draft of) a model for language-related categories that are of interest for
IT purposes. As is quite obvious, the distinction Deborah needs to
make is not language -- it's all Japanese -- or country -- it's all Japan;
rather, she's distinguishing between *writing systems* -- a notion that is
closely related to but distinct from language. In that paper, I suggest
that country is not generally appropriate for distinguishing writing
systems (because writing system distinctions don't generally match national
borders). Country can be relevant for distinguishing *orthographies*, i.e.
spelling conventions and things closely associated with spelling
conventions (e.g. hyphenation), but not for distinguishing writing systems.

In the model I propose, orthographies are defined by qualifying a writing
system, and not the other way around (e.g. US and UK English share the same
writing system but use different spelling; thus orthography is the more restrictive
notion). Since country codes are useful in relation to orthographies, and
since orthographies are derived from writing systems, I suggest that
writing system qualifiers should be more closely positioned in relation to
the individual language ID portion of the overall identifier. Thus (using
hypothetical tags) ja-kana-JP rather than ja-JP-kana.
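
To make the proposed ordering concrete, here is a minimal Python sketch that
builds and reads tags in the order language, writing system, country. The
names LanguageTag, format_tag and parse_tag are invented for illustration,
"kana" is the hypothetical qualifier used above, and the rule that a
two-letter subtag is a country code is an assumption of this sketch only,
not part of any standard.

```python
from typing import NamedTuple, Optional


class LanguageTag(NamedTuple):
    """Hypothetical identifier: language, then writing system, then country."""
    language: str
    writing_system: Optional[str] = None  # hypothetical qualifier, e.g. "kana"
    country: Optional[str] = None         # qualifies orthography, so it comes last


def format_tag(tag: LanguageTag) -> str:
    """Join the parts so the writing system sits next to the language ID."""
    parts = [tag.language.lower()]
    if tag.writing_system:
        parts.append(tag.writing_system.lower())
    if tag.country:
        parts.append(tag.country.upper())
    return "-".join(parts)


def parse_tag(text: str) -> LanguageTag:
    """Split a tag back into its parts.

    Assumption for this sketch only: a two-letter alphabetic subtag is a
    country code; any other subtag is treated as a writing system qualifier.
    """
    language, *rest = text.split("-")
    writing_system = None
    country = None
    for part in rest:
        if len(part) == 2 and part.isalpha():
            country = part.upper()
        else:
            writing_system = part.lower()
    return LanguageTag(language.lower(), writing_system, country)


print(format_tag(LanguageTag("ja", "kana", "JP")))  # ja-kana-JP
```

Dropping the country from such a tag still leaves a meaningful identifier
for the writing system (ja-kana), which is the point of putting the
writing system qualifier first.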

I don't discuss this at length in the paper, but I also think it is not a
good idea in the long run for us to make this a *locale* distinction: just
because there's a difference in writing system, it doesn't mean that a
whole bunch of parameters that aren't necessarily linguistically-dependent
at all (e.g. number formats) are going to change.
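
As a rough illustration of that point, the Python sketch below formats a
number using separators looked up by country alone. The table
NUMBER_SEPARATORS and the function format_number are invented for this
example and cover only three countries; the writing_system argument is
accepted but deliberately ignored, to show that the result does not depend
on it.

```python
from typing import Dict, Optional, Tuple

# Illustrative data only: (grouping separator, decimal separator) by country.
NUMBER_SEPARATORS: Dict[str, Tuple[str, str]] = {
    "JP": (",", "."),
    "US": (",", "."),
    "DE": (".", ","),
}


def format_number(value: float, country: str,
                  writing_system: Optional[str] = None) -> str:
    """Format value with two decimals using the country's separators.

    writing_system is unused on purpose: in this sketch the number format
    is a property of the country, not of the writing system.
    """
    group, decimal = NUMBER_SEPARATORS[country]
    text = f"{value:,.2f}"  # e.g. "1,234.50" with "," grouping and "." decimal
    # Swap separators via a placeholder so the two replacements do not collide.
    return text.replace(",", "\0").replace(".", decimal).replace("\0", group)


# Same country, different writing system: the output is identical.
assert format_number(1234.5, "JP", "kana") == format_number(1234.5, "JP")
```

If writing system were folded into the locale name, both calls above would
need separate locale entries carrying identical number-format data.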

Now, Deborah, since you're talking about "standard ISO locales", you
probably have particular implementation constraints that you're operating
under, and you may not have the freedom to do things the way that makes
most logical sense (even though Apple has historically been thought of as
the company that implements things the "right" way). I think we'd be better
off in the long run, though, if "standard ISO" conventions for naming
locales were revisited and brought into conformance with a logical model
that fits the ontological reality being described. (Many would say it would
be even better if the structure and definition of the notion of locale itself
were revisited -- a common theme on the locales list).

- Peter

---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <peter_constable@sil.org>


