RE: TR35

From: Peter Constable (petercon@microsoft.com)
Date: Thu May 13 2004 - 18:01:03 CDT

    > You speak as if date or number formats had nothing to do with language. I very
    > much disagree. If I have a message that says: "The date of the last version of
    > this document was 2003年3月20日", nobody in their right mind would say that that is
    > correct English.

    I never said they would. The correct analysis of that content is that it has two runs that are in different languages. (So, AFAICT, your example does not prove anything.)

    > The core of what anyone means by locale is the language -- and that means, in
    > our context, written language, thus including script (Cyrl vs Latn) and variants
    > (such as US vs UK spelling).

    I have been putting "language" in quotation marks because the category types involved include writing system and orthography -- you've heard my presentation on that, so you know that I agree with you on that particular point.

    As for "language" being the core of what anyone means by locale, I have most certainly said that "language" is one of the defining components of a locale. There may even be situations (translation software being an example) in which the processing mode does not care about anything else. But in general, locales -- software processing modes tailored for cultural user preferences -- *do* involve other non-linguistic components. Even in an example like translation software where such non-linguistic components are not needed, the infrastructure for managing the processing mode is working in terms of parameter bundles that *do* include non-linguistic components. And distinctions for such non-linguistic components are not, in any situation I can think of, useful things to declare regarding linguistic documents.

    > The choice of language affects most of what people
    > traditionally associate with software globalization, including date, time,
    > number, currency, formatting & parsing; segmentation (words, lines); collation
    > and searching; resource bundle choice for translated text & appropriate icons,
    > etc.

    C'mon, Mark. Certainly a choice of language affects how something like a date is displayed, but it is not the only factor. If I tell you that my language is English, even English with US spelling, that does *not* tell you how I want my numbers, dates, times, etc. formatted. It may give you a hint, and that hint may even lead you to do what I want; but it also might not. (IIRC, you yourself prefer to use a date format that is *not* what most systems would guess at from being told that your language preference is US English.) Therefore it is plainly *not* the case that "language" is all that anybody means by locale. Thus, the premise of your statement

    > So if that is all of what someone means by locale, then there is little point in
    > distinguishing between "locale IDs" and "language IDs".

    is not established, and thus the implication is not established.
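
    To make the point about date formats concrete, here is a minimal Python sketch. The names UserPrefs and format_date are hypothetical, invented only for illustration; they do not come from any standard or library. It shows two users who declare the same language tag but want different date formats, so the tag alone cannot settle the formatting.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class UserPrefs:
    """Hypothetical preference bundle: language is only one of its parts."""
    language: str      # a language tag such as "en-US"
    date_pattern: str  # strftime pattern the user actually wants


def format_date(d: date, prefs: UserPrefs) -> str:
    # The date pattern, not the language tag, decides the output.
    return d.strftime(prefs.date_pattern)


d = date(2003, 3, 20)
typical = UserPrefs(language="en-US", date_pattern="%m/%d/%Y")
iso_fan = UserPrefs(language="en-US", date_pattern="%Y-%m-%d")

print(format_date(d, typical))  # 03/20/2003
print(format_date(d, iso_fan))  # 2003-03-20
```

    Both users have identical language values; guessing the format from "en-US" would satisfy only one of them.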

    You are making broad, general comments without considering carefully enough how things are really used. To repeat something I said earlier, it would not be a good idea to design a transaction-processing system that makes assumptions about how to interpret formatted number or currency strings from a language preference, or even from being told what locale was set on the originating system; I need to know exactly what determined the formatting of the string I received. *That* is an example of the level of discussion of scenarios that needs to happen before any meaningful statements can be made about what a "language" or "locale" ID is and how it should be used. It simply is not good enough to say "people traditionally associate [language] with ... date [etc.]". You are trying to justify wrong (IMO) conclusions using inadequate analysis.
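
    As a rough illustration of the transaction-processing scenario, the Python sketch below assumes the sender transmits the exact separators used to format the amount. NumberFormat and parse_amount are hypothetical names, not part of any real protocol or library. The same string parses to different amounts depending on the declared conventions, which is why a language or locale ID alone is not enough.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class NumberFormat:
    """Hypothetical explicit description of how a number string was formatted."""
    decimal_sep: str
    group_sep: str


def parse_amount(text: str, fmt: NumberFormat) -> Decimal:
    # Strip grouping separators first, then normalize the decimal separator.
    if fmt.group_sep:
        text = text.replace(fmt.group_sep, "")
    return Decimal(text.replace(fmt.decimal_sep, "."))


# Assumed conventions, supplied by the sender rather than guessed from a tag.
comma_groups = NumberFormat(decimal_sep=".", group_sep=",")
comma_decimal = NumberFormat(decimal_sep=",", group_sep=".")

received = "1,234"
print(parse_amount(received, comma_groups))   # 1234
print(parse_amount(received, comma_decimal))  # 1.234
```

    A receiver that guessed the convention from a language preference could be off by a factor of a thousand here.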

    Locales in general *do* involve things beyond "language", and it is wrong to put declarations specifically for such non-linguistic things into an attribute like xml:lang. It is therefore (for instance) entirely unhelpful to refer to RFC 3066 tags as locale tags, as though there were no difference.

    I think 20 years of practice in software design have gotten many people stuck in a rut, but the fact that people have thought in a given way for twenty years doesn't make it right or desirable.

    Peter Constable


