From: Mark Davis (mark.davis@jtcsv.com)
Date: Tue Apr 27 2004 - 16:10:34 EDT
You do make some good points -- but I still disagree ;-)
Sorry for not answering earlier -- I've been a bit swamped. Will try to get time
soon to reply.
Mark
__________________________________
http://www.macchiato.com
► शिष्यादिच्छेत्पराजयम् ◄
----- Original Message -----
From: "Peter Constable" <petercon@microsoft.com>
To: "Unicode List" <unicode@unicode.org>
Sent: Mon, 2004 Apr 26 09:25
Subject: RFC 3066 tags vs. locales (was RE: Common Locale Data Repository
Project)
> Mark:
>
> I really feel your usage of terminology here is unhelpful -- in very
> practical ways, unhelpful, because it makes it more difficult to get
> people to understand how to implement things in the right way.
>
> It may be that the application that most interests you is the naming of
> locales, but that does not change the fact that the notions of "locale"
> and "language" are different, and that the primary intent of RFC 1766
> and its successors has always been identification of "languages", as
> the title and introduction to RFC 3066 indicate:
>
> "Tags for the Identification of Languages"
>
> "One means of indicating the language used is by labeling the
> information content with an identifier for the language that is used in
> this information content."
>
> Whether in your broad or narrow sense, a locale is an operational mode
> of a software application or of a software operating environment to
> provide culture-dependent tailoring.
>
> "Language" in the sense used by RFC 1766/3066 is a
> linguistically-related attribute of content, and a language identifier
> is used to label content to indicate that attribute, or to label
> resources (e.g. spelling checkers) that can appropriately be applied to
> that content. I think that's stated reasonably clearly in RFC 1766/3066.
>
> One should also refer to RFC 2277, IETF Policy on Character Sets and
> Languages, which clearly distinguishes "language" tags and "locale"
> tags. In the IETF context, which is the context for RFC 1766/3066, those
> documents do *not* provide tags for locales; they provide tags
> for languages.
>
>
> > There is, as I have said, a perfectly reasonable, narrow sense of
> > locale which is essentially identical to what is captured by RFC 3066.
>
> But that does not mean that it's a good thing to refer to RFC 3066 tags
> as locale identifiers.
>
> > And in practice, RFC 3066 is often used with that meaning. I don't see
> > any need to deny reality (at least not in this area ;-)
>
> I think you overstate actual practice: For many years, various software
> implementations have used combinations of ISO 639-1 language identifiers
> and ISO 3166 country identifiers joined with an underscore to create
> locale identifiers; e.g. "en_US". It was not until Microsoft's .Net
> Framework that locales ('CultureInfo' in that context) have been named
> using strings that *resemble* RFC 3066 tags -- and it needs to be
> pointed out that the namespace for CultureInfo.Name is not the same as
> the RFC 3066 namespace.
>
> It may be that you and some others have come to refer to RFC 3066 tags
> as "locale" (in some unspecified sense) identifiers, but that
> terminology certainly is not used by all. Indeed, as mentioned above, it
> is counter to IETF practice as described in RFC 2277.
>
> My contention is that it's unhelpful to refer to RFC 3066 tags as "locale"
> tags. I have no problem with *using* RFC 3066 to name certain locales,
> or to control the operational mode of software processes in certain
> contexts. But saying that RFC 3066 tags are "locale" tags is decidedly
> unhelpful in getting people to understand what are appropriate
> requirements of implementations. While you may have a conceptualization
> that distinguishes between "narrow" and "broad" senses of "locale",
> there are at least some software implementers (and I suspect this
> applies to most) that only know of "locale", without any distinction of
> subtypes. As a result, people inevitably will end up confusing
> namespaces for locales with the RFC 3066 namespace. My concern is that
> this will lead to problems of interoperation, and will potentially
> undermine RFC 3066.
>
> Consider a couple of situations. First, someone needs to define in their
> software a locale for (say) US English but with a 24-hour time format.
> Yes, that falls in your broad rather than narrow sense of locale, but
> there are lots of software implementers out there that don't know the
> difference. All they know is that someone they consider knowledgeable in
> i18n/g11n issues has referred to RFC 3066 tags as "locale tags". So,
> they decide to name their locale "en-US-24hr". Then they write software,
> or document their system leading others to write software, that inserts
> this name into contexts like xml:lang. We know they shouldn't do it, but
> they don't know that; and referring to RFC 3066 as "locale" tagging only
> encouraged them to do this. And once they've done it, it can become a
> problem that all of us have to work around.
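
A minimal sketch of why a name like "en-US-24hr" can slip through unnoticed. It assumes the RFC 3066 grammar as I read it: a primary subtag of 1 to 8 letters, followed by any number of subtags of 1 to 8 letters or digits. Under that reading the string is syntactically well-formed, so a purely syntactic check will not reject it. The name `is_well_formed` is hypothetical, and the check deliberately ignores the registration rules.

```python
import re

# Syntax only (assumption: RFC 3066 grammar as summarized in the lead-in):
# primary subtag = 1-8 letters; each later subtag = 1-8 letters or digits.
_TAG_SYNTAX = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def is_well_formed(tag: str) -> bool:
    """Return True if `tag` matches the syntax above; says nothing about meaning."""
    return _TAG_SYNTAX.fullmatch(tag) is not None


for candidate in ("en-US", "en-US-24hr", "en_US"):
    print(candidate, is_well_formed(candidate))
```

Passing this check does not make "en-US-24hr" a language identifier: a time format is not a linguistic attribute of content, and nothing in the syntax alone will tell an implementer that.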
>
> Secondly, consider Mongolian. Documents written in Mongolian using
> Mongolian script should be tagged (following the provisions of RFC
> 3066bis) as "mn-Mong". No distinction needs to be made as to whether
> these documents were written in Mongolia or in the PRC. Therefore,
> there's no need to tag the documents as "mn-Mong-CN" or "mn-Mong-MN".
> But for software locales, this country distinction *is* important. So,
> if a software implementer names their locale "mn-Mong-MN" and then
> assumes they should insert that string into the Accept-Language header
> of an HTTP request, there's a better than fair chance that content will
> not be returned according to what the user would prefer. What the user
> wants is "mn-Mong", and that's how the content is tagged; but because
> the software implementer didn't understand that the intent of RFC 3066
> and the requirements for locales are not the same, the request that was
> sent was overly specific.
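
The mismatch in the Mongolian case can be sketched as follows. This assumes the server applies only the basic prefix-matching rule that HTTP/1.1 (RFC 2616, section 14.4) describes for Accept-Language, with no fallback logic of its own: a language-range matches a tag if it equals the tag, or equals a prefix of the tag that is followed by "-". The name `range_matches` is hypothetical.

```python
def range_matches(language_range: str, tag: str) -> bool:
    """Basic prefix matching of a requested language-range against a content tag.

    Assumption: plain RFC 2616-style matching, case-insensitive, no fallback.
    """
    requested = language_range.lower()
    content = tag.lower()
    if requested == "*":
        return True
    # Match on equality, or when the range is a prefix ending at a "-" boundary.
    return content == requested or content.startswith(requested + "-")


# The content is tagged "mn-Mong"; compare the two possible requests.
for requested in ("mn-Mong-MN", "mn-Mong"):
    print(requested, range_matches(requested, "mn-Mong"))
```

Under that assumption the locale-derived range "mn-Mong-MN" fails to match content tagged "mn-Mong", while the shorter range "mn-Mong" matches it (and would also match content tagged "mn-Mong-MN" or "mn-Mong-CN").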
>
> So, I will persist in trying to get people to understand that RFC 3066
> tags are not "locale" tags, and ask that you not perpetuate confusion
> that is out there.
>
>
> Peter
>
> Peter Constable
> Globalization Infrastructure and Font Technologies
> Microsoft Windows Division
>
>