Re: CLDR: 2 vs. 4 digit years in US?

From: Jukka K. Korpela
Date: Wed Dec 07 2005 - 03:09:36 CST

    On Tue, 6 Dec 2005, Doug Ewell wrote:

    > Rick Cameron <Rick dot Cameron at businessobjects dot com> wrote:
    >> Is the CLDR meant to be descriptive or prescriptive?
    >> If the former, I would say that 06/12/05 is far more common in North
    >> America than 06/12/2005.
    > Good question.

    The heart of the matter, I think. Basically, CLDR is meant to be
    descriptive, but there may be situations where a prescriptive approach
    is in order. If we know that some commonly used notation often
    causes uncertainty, confusion, and misunderstanding of data, maybe we
    should define a representation format that is less ambiguous.

    However, this should be taken in the context of the linguistic community
    (or group of people) for which the locale is intended. The question is
    whether there is serious risk of misunderstanding within the community.
    When someone has chosen the US English locale, does he understand
    unambiguously what 06/12/05 means?

    I'm inclined to answer "yes", but the opposite answer would make
    sense, too: en_US is commonly the default that a user does not
    override, even when his understanding of English is much less than
    perfect. We would surely like to change this so that people use the locale
    that fits them best, but there are many practical obstacles. It takes a
    long time before users learn to check locale settings wherever they work.
    The locale that fits the user best, in principle, might be incompletely or
    even incorrectly defined; localized versions of software all too often use
    language so horrendous that it can only be understood by back-translating
    it into English!

    Thus, due to the lingua franca status of en_US, it might be appropriate to
    treat it as different from other locales, with some special requirements
    on universal unambiguity of notations. Ideally, en_US should be
    specifically the US variant of English (itself with variants, of course),
    whereas "universal English" would be the common default. But I'm afraid we
    can't change the vendors' habit of making en_US the factory setting.

    However, I'm still inclined to say that 06/12/05 is the
    appropriate shortest format for en_US. The main reason is that although
    disambiguation would be needed, 06/12/2005 wouldn't really disambiguate
    the essential part. It's the June 12 vs. December 6 interpretation that
    is the problem here, rather than the question which part is the year.
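
    To make this concrete, here is a minimal sketch in Python, assuming
    only the standard library's datetime.strptime. The helper name
    "interpretations" is hypothetical, chosen for illustration. It reads the
    same string month-first and day-first, once with a 2-digit and once with
    a 4-digit year, and suggests that the longer year leaves the day/month
    ambiguity untouched.

```python
from datetime import datetime

# Hypothetical helper: read a slash-separated date under both common
# field orders. year_directive is "%y" (2-digit) or "%Y" (4-digit).
def interpretations(text, year_directive):
    month_first = datetime.strptime(text, f"%m/%d/{year_directive}").date()
    day_first = datetime.strptime(text, f"%d/%m/{year_directive}").date()
    return month_first, day_first

short = interpretations("06/12/05", "%y")
long_ = interpretations("06/12/2005", "%Y")

# Both notations yield the same pair of candidate dates:
# June 12 and December 6 of the same year.
print(short)
print(long_)
```

    Note that strptime's "%y" applies its own pivot rule for 2-digit years,
    which is a convention of that library and not something CLDR defines.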

    (The "which figure is the year" puzzle is very real for some notations.
    I often find "best before" dates on products in the format 04-06-08.
    I have no direct way of finding out whether the product is good until
    2008 or whether it should have been consumed in 2004 at the latest,
    because the notation is used both ways in practice.)
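
    The same kind of sketch, again assuming Python's datetime.strptime,
    shows how far apart the readings of such a label can be. The list of
    candidate field orders is an assumption for illustration, not a survey
    of what manufacturers actually print.

```python
from datetime import datetime

label = "04-06-08"

# Candidate field orders (illustrative assumption, not exhaustive).
candidates = {
    "year-month-day": "%y-%m-%d",
    "day-month-year": "%d-%m-%y",
    "month-day-year": "%m-%d-%y",
}

# Parse the same label under each assumed order.
readings = {
    name: datetime.strptime(label, fmt).date()
    for name, fmt in candidates.items()
}

for name, date in readings.items():
    print(name, date.isoformat())
```

    Every candidate parses without error, so nothing in the string itself
    tells the reader which order was intended.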

    > I hate 2-digit years and never write 'em, but if the
    > objective is to describe what most Americans do, well, they write 2 digits.

    Or what most Americans regard as the most natural and familiar. This of
    course almost always coincides with what they would write themselves, but
    sometimes it can be argued that a person would find a longer and clearer
    expression more natural than the shorter notation he uses himself.

    Jukka "Yucca" Korpela
