From: Jukka K. Korpela (firstname.lastname@example.org)
Date: Wed Dec 07 2005 - 03:09:36 CST
On Tue, 6 Dec 2005, Doug Ewell wrote:
> Rick Cameron <Rick dot Cameron at businessobjects dot com> wrote:
>> Is the CLDR meant to be descriptive or prescriptive?
>> If the former, I would say that 06/12/05 is far more common in North
>> America than 06/12/2005.
> Good question.
The heart of the matter, I think. Basically, CLDR is meant to be
descriptive, but there may be situations where a prescriptive approach
is in order. If we know that some commonly used notation often
causes uncertainty, confusion, and misunderstanding of data, maybe we
should define a representation format that is less ambiguous.
However, this should be taken in the context of the linguistic community
(or group of people) for which the locale is intended. The question is
whether there is serious risk of misunderstanding within the community.
When someone has chosen the US English locale, does he understand
unambiguously what 06/12/05 means?
I'm inclined to answer "yes", but the opposite answer would make
sense, too: en_US is commonly the default that a user does not
override, even when his understanding of English is much less than
perfect. We would surely like to change this so that people use the locale
that fits them best, but there are many practical obstacles. It takes a
long time before users learn to check locale settings wherever they work.
The locale that fits the user best, in principle, might be incompletely or
even incorrectly defined; localized versions of software all too often use
a horrendous language that can only be understood by back-translating it
into English.
Thus, due to the lingua franca status of en_US, it might be adequate to
treat it as different from other locales, with some special requirements
on universal unambiguity of notations. Ideally, en_US should be
specifically the US variant of English (itself with variants, of course),
whereas "universal English" would be the common default. But I'm afraid we
can't change the vendors' habit of making en_US the factory setting.
However, I'm still inclined to say that 06/12/05 is the
appropriate shortest format for en_US. The main reason is that although
disambiguation would be needed, 06/12/2005 wouldn't really disambiguate
the essential part. It's the June 12 vs. December 6 interpretation that
is the problem here, rather than the question of which part is the year.
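To illustrate the point, here is a minimal Python sketch. It is only an illustration, not anything CLDR defines: the helper name `interpretations` and the choice of candidate patterns are assumptions made for this example. It parses the same string under a month-first and a day-first pattern and shows that widening the year to four digits leaves both readings in place.

```python
from datetime import datetime

def interpretations(text, formats):
    """Map each strptime pattern that accepts `text` to the date it yields."""
    results = {}
    for fmt in formats:
        try:
            results[fmt] = datetime.strptime(text, fmt).date()
        except ValueError:
            # This pattern cannot read the string at all; skip it.
            pass
    return results

# Two-digit year: month-first vs. day-first.
short = interpretations("06/12/05", ["%m/%d/%y", "%d/%m/%y"])

# Four-digit year: the year is now explicit, the day/month order is not.
long_ = interpretations("06/12/2005", ["%m/%d/%Y", "%d/%m/%Y"])

for fmt, value in {**short, **long_}.items():
    print(fmt, "->", value.isoformat())
```

Both strings produce the same pair of candidate dates, June 12 and December 6 of 2005, so the longer form removes only the doubt about the year.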
(The "which figure is the year" puzzle is very real for some notations.
I often find "best before" dates on products in the format 04-06-08.
I have no direct way of finding out whether the product is good until
2008 or whether it should have been consumed in 2004 at the latest,
because the notation is used both ways in practice.)
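The "best before" case can be sketched the same way. This is a rough illustration under stated assumptions: the three candidate layouts in `CANDIDATES` and the helper `possible_dates` are hypothetical and chosen for the example, not taken from any labelling standard.

```python
from datetime import datetime

# Hypothetical candidate layouts for a label such as "04-06-08".
CANDIDATES = {
    "year-month-day": "%y-%m-%d",
    "day-month-year": "%d-%m-%y",
    "month-day-year": "%m-%d-%y",
}

def possible_dates(label):
    """Return every candidate layout that can read `label`, with its date."""
    found = {}
    for name, fmt in CANDIDATES.items():
        try:
            found[name] = datetime.strptime(label, fmt).date()
        except ValueError:
            continue
    return found

readings = possible_dates("04-06-08")
years = sorted({d.year for d in readings.values()})

for name, value in readings.items():
    print(name, "->", value.isoformat())
print("possible years:", years)
```

All three layouts accept the label, and the readings disagree on the year (2004 or 2008), which is exactly the uncertainty described above.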
> I hate 2-digit years and never write 'em, but if the
> objective is to describe what most Americans do, well, they write 2 digits.
Or what most Americans regard as the most natural and familiar. This of
course almost always coincides with what they would write themselves, but
sometimes it can be argued that a person would find a longer and clearer
expression more natural than the shorter notation he uses himself.
-- Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/