Re: Internationali?ation

From: Misha Wolf (
Date: Fri Oct 17 1997 - 15:55:51 EDT

Bernard Chester wrote:

> That's because the focus is wrong. It's not codepages, nor languages
> that you focus on when doing Localization, it's locales.
> A locale captures all of these cultural expectations. A locale has a
> specific language dialect, and conventions like number and currency
> display.
> fr is a language (ISO 639); CA is a country (ISO 3166); fr-CA is a
> locale designator (this is a different French than used in Paris!)

Indeed, that is a particular model, though not a very useful one in the
context of the *World Wide* Web, as it mixes levels.

Data should be encoded using universal schemes: Text should be encoded
using the Universal Character Set (Unicode), tagged with language
information to enable operations such as conversion to speech, hyphenation,
line breaking, spell-checking, culturally-sensitive glyph-selection and so
on. Numerical data, such as dates, amounts of money, etc., should be
encoded using the appropriate canonical form, for instance the ISO 8601
standard for dates, the ISO 4217 standard for currency codes and so on.
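As a rough illustration of the storage side, here is a minimal Python sketch
of a record held in locale-neutral form. The StoredItem class, its field
names, the make_item helper and the sample values are all hypothetical and
chosen only for this example; the point is simply that the text carries a
language tag, the date is an ISO 8601 string and the amount carries an
ISO 4217 currency code.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class StoredItem:
    """Hypothetical record kept in locale-neutral, canonical form."""
    text: str       # Unicode text
    lang: str       # language tag for the text, e.g. "fr-CA"
    date_iso: str   # ISO 8601 calendar date, YYYY-MM-DD
    amount: str     # plain decimal string, no grouping separators
    currency: str   # ISO 4217 currency code, e.g. "CAD"


def make_item(text: str, lang: str, day: date,
              amount: Decimal, currency: str) -> StoredItem:
    # Convert native values to their canonical string forms once, at storage time.
    return StoredItem(text, lang, day.isoformat(), str(amount), currency)


# Sample values are assumptions made up for this sketch.
item = make_item("Le prix a changé.", "fr-CA",
                 date(1997, 10, 17), Decimal("1234.50"), "CAD")
print(item)
```

Nothing in the stored record says how the date or the amount should look on
screen; that decision is left to whoever displays it.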

The *locale* comes into its own in the realm of user preferences, for both
input and output. Users may want to see dates displayed as:


The stored date is none of these, but is rather: YYYY-MM-DD.
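The split between stored form and displayed form can be sketched in Python as
follows. The preference names and display patterns are assumptions invented
for this illustration (they are not the examples from the original message),
and display_date is a hypothetical helper.

```python
from datetime import date

# Hypothetical user preferences mapped to numeric display patterns.
DISPLAY_PATTERNS = {
    "day-month-year, slashes": "%d/%m/%Y",
    "month-day-year, slashes": "%m/%d/%Y",
    "day-month-year, dots": "%d.%m.%Y",
}


def display_date(stored: str, preference: str) -> str:
    """Render a stored ISO 8601 date (YYYY-MM-DD) according to a user preference."""
    parsed = date.fromisoformat(stored)  # the stored form is always canonical
    return parsed.strftime(DISPLAY_PATTERNS[preference])


stored = "1997-10-17"
for preference in DISPLAY_PATTERNS:
    print(preference, "->", display_date(stored, preference))
```

The stored value never changes; only the pattern chosen at display time
depends on the user.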

  Misha Wolf          Email:                     85 Fleet Street
  Standards Manager   Voice: +44 171 542 6722    London EC4P 4AJ
  Reuters Limited     Fax  : +44 171 542 8314    UK
12th International Unicode Conference, 8-9 Apr 1998, Tokyo,
   7th World Wide Web Conference, 14-18 Apr 1998, Brisbane,

Any views expressed in this message are those of the individual sender,
except where the sender specifically states them to be the views of
Reuters Ltd.
