RE: Common Locale Data Repository 1.1 beta

From: Carl W. Brown (cbrown@xnetinc.com)
Date: Mon May 17 2004 - 13:27:16 CDT

In reply to: Mark Davis: "Common Locale Data Repository 1.1 beta"

Mark,

I am impressed with the data collected, but I have problems with the structure and with some of the actual data values.

For example, if I want to handle date/time data, I need time zone info. I may also need country information to parse and format the date, and language info for things like month and day-of-week names.

To me, mixing country-dependent and sublanguage-dependent data together makes no sense. I have this problem with ICU as well.

Language should be: language, script, country sublanguage, and variant.

The country values should be stored differently. It is a very bad idea to replicate the same country values in every locale, because it violates the principles of normalization. Besides, some variants, such as the EURO variant, apply to the country and not to the language.

Locales are commonly passed as strings. Thus, if we use a lowercase country code to specify a sublanguage, as distinct from the country itself, we can have a locale like: "es_mx_US#America/Los_Angeles". In this scheme "en_US#America/Los_Angeles" would be the same as: "en_us_US#America/Los_Angeles".

If you want to maintain compatibility with systems like Windows that use LCIDs, you can use separate LCIDs for the language and country values when you have a mixed environment like "es_mx_US".

"es_mx_US#America/Los_Angeles" is easy to implement: if the value is language dependent, you look under "es_mx"; country-dependent data is under "US"; and the time zone is "America/Los_Angeles". This greatly reduces normalization problems, not only in the locale data but also in user databases, and it provides automatic support for locale combinations with much less effort.
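
A minimal sketch of that lookup in Python, to make the idea concrete. The names parse_locale, resolve, LANGUAGE_DATA and COUNTRY_DATA are hypothetical, the table contents are illustrative placeholders rather than CLDR data, and script and variant fields are left out for brevity:

```python
from typing import NamedTuple, Optional


class LocaleParts(NamedTuple):
    language: str             # e.g. "es"
    sublanguage: str          # lowercase country code naming the sublanguage, e.g. "mx"
    country: str              # uppercase country code, e.g. "US"
    time_zone: Optional[str]  # e.g. "America/Los_Angeles", or None if absent


def parse_locale(tag: str) -> LocaleParts:
    """Split a string such as "es_mx_US#America/Los_Angeles" into its parts."""
    locale_part, _, tz = tag.partition("#")
    fields = locale_part.split("_")
    language = fields[0]
    sublanguage = None
    country = None
    for field in fields[1:]:
        if field.islower() and sublanguage is None and country is None:
            sublanguage = field
        elif field.isupper() and country is None:
            country = field
        else:
            raise ValueError(f"unexpected field {field!r} in {tag!r}")
    if country is None:
        raise ValueError(f"no country in {tag!r}")
    if sublanguage is None:
        # "en_US" is shorthand for "en_us_US".
        sublanguage = country.lower()
    return LocaleParts(language, sublanguage, country, tz or None)


# Hypothetical, normalized tables: each value is stored exactly once.
LANGUAGE_DATA = {
    "es_mx": {"first_month": "enero"},
    "en_us": {"first_month": "January"},
}
COUNTRY_DATA = {
    "US": {"currency": "USD"},
}


def resolve(tag: str) -> dict:
    """Fetch language data, country data and time zone independently."""
    parts = parse_locale(tag)
    return {
        "language": LANGUAGE_DATA[f"{parts.language}_{parts.sublanguage}"],
        "country": COUNTRY_DATA[parts.country],
        "time_zone": parts.time_zone,
    }
```

With these tables, resolve("es_mx_US#America/Los_Angeles") picks up the Spanish month name from "es_mx" and the currency from "US" without any "es_mx_US" entry having to exist.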

The short time zone names should be common only to the specific country that uses them. Even for the US locale they are a mess. Both America/Anchorage and America/Halifax use "AST" (Alaska Standard Time / Atlantic Standard Time).
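
One way to picture the fix, as a rough Python sketch: key the short names by country as well as by abbreviation. The table and function names are hypothetical, and assigning "AST" to Alaska under "US" and to Halifax under "CA" is an assumption made for illustration only:

```python
from typing import Optional

# Hypothetical table: a short name is only meaningful within one country.
SHORT_ZONE_NAMES = {
    ("US", "AST"): "America/Anchorage",  # Alaska Standard Time
    ("CA", "AST"): "America/Halifax",    # Atlantic Standard Time
}


def zone_for_abbreviation(country: str, abbreviation: str) -> Optional[str]:
    """Resolve a short zone name within a single country, or None if unknown."""
    return SHORT_ZONE_NAMES.get((country, abbreviation))


def zones_sharing(abbreviation: str) -> list:
    """List every zone that uses the abbreviation when the country is ignored."""
    return sorted(zone for (_, abbr), zone in SHORT_ZONE_NAMES.items()
                  if abbr == abbreviation)
```

Ignoring the country, zones_sharing("AST") returns two candidates, which is exactly the ambiguity described above; with the country supplied, each lookup has a single answer.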

Carl

> -----Original Message-----
> From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]On
> Behalf Of Mark Davis
> Sent: Friday, May 14, 2004 10:09 PM
> To: cldr@unicode.org
> Cc: unicode@unicode.org; unicore@unicode.org
> Subject: Common Locale Data Repository 1.1 beta
>
>
> We are starting the beta process for CLDR 1.1 and LDML 1.1. New comparison
> charts have been generated and are available for review. Because of the
> transition from OpenI18N to Unicode sponsorship, a relatively small number
> of changes have been included in this release. Among them are the ability
> to have very narrow month and day names for calendar headings, two
> grammatical forms for month/day names, and the incorporation of new data
> and data fixes. The release period will also be very short due to this
> transition, so we would like any additional beta comments in by one week
> from now, May 21.
>
> Not all fixes slated for 1.1 have been incorporated into the charts or the
> LDML document; they will be rolled out in the coming days. (In particular,
> the collation changes have not been yet reflected in the charts.) For more
> information on CLDR, see http://www.unicode.org/cldr/.
>
> Mark
> __________________________________
> http://www.macchiato.com
> ► शिष्यादिच्छेत्पराजयम् ◄

