From: Phillips, Addison (firstname.lastname@example.org)
Date: Mon Apr 20 2009 - 23:32:43 CDT
A few notes on this thread. Note that these are *personal* comments, notwithstanding my .sig.
"Language preference" isn't quite the same thing as "locale", although they are closely related. Locale is a programming concept useful in many ways, but mostly to do with APIs.
The Accept-Language header was intended to do language negotiation, but since implementation of it is inconsistent and since managing it is quite arcane, language negotiation via Accept-Language (A-L) alone is usually not fully satisfying. Sites that rely solely on A-L eventually tend to migrate to some form of personalization scheme (such as cookies) to track the actual user preference---even Google does this today. [Implementers should read and understand RFC 4647 and the "lookup" algorithm to avoid spotty performance such as that cited by Peter Krefting below. RFC 2616 is just too vague to make an effective algorithm.]
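To make the RFC 4647 "lookup" step concrete, here is a minimal sketch in Python. It assumes the server has a fixed list of supported language tags. The function names parse_accept_language and lookup are illustrative and not taken from any library. It is not a complete implementation: there is no validation of tags and no handling of grandfathered tags.

```python
def parse_accept_language(header):
    """Return the language ranges in an Accept-Language value, highest q first."""
    ranges = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0]
        if not tag:
            continue
        q = 1.0  # a range without an explicit q-value has quality 1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat a malformed q as "not acceptable"
        if q > 0:  # q=0 means the user does not want this language
            ranges.append((q, position, tag))
    # Sort by descending q; ties keep the order in which they were sent.
    ranges.sort(key=lambda r: (-r[0], r[1]))
    return [tag for _, _, tag in ranges]


def lookup(ranges, supported, default):
    """RFC 4647 section 3.4 style lookup: truncate each range until a tag matches."""
    available = {tag.lower(): tag for tag in supported}
    for language_range in ranges:
        if language_range == "*":
            continue  # the wildcard carries no information for lookup
        subtags = language_range.lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate in available:
                return available[candidate]
            subtags.pop()
            # Single-character subtags (such as "x") go away together with
            # the subtag that followed them.
            while subtags and len(subtags[-1]) == 1:
                subtags.pop()
    return default


header = "sv-SE,sv;q=0.9,nb;q=0.8,da;q=0.7,en;q=0.6,de;q=0.1"
print(lookup(parse_accept_language(header), ["en", "de"], "en"))
```

With the header quoted in the message below plus a low-priority "de", a server that offers only English and German selects English, because "en" outranks "de" in the user's list. The default is used only when no range matches at all.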
Geolocation is not as bad as Peter Krefting makes it sound below. I know my general reaction to it has been negative---just because I'm in the Frankfurt airport doesn't mean I want German content, to pay in Euros, etc. However, geolocation can be exceedingly useful for finding "locality", local resources, or when all else fails (uncookied A-L-free browser pointed at a generic URI).
Most sites that do language/locale negotiation end up providing some form of user interaction for managing the language following the negotiation process (hence the prevalence of cookie-ing or URL-rewriting) so that users can get what they want. With multiple ways of getting it wrong, you have to allow the user to adapt.
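One way to read the last few paragraphs is as a precedence order: an explicit user choice (for example, from a cookie) first, then Accept-Language, then geolocation as a last resort. The sketch below assumes that ordering. The function resolve_language and its parameter names are hypothetical, and matching is a plain exact comparison rather than a full RFC 4647 lookup.

```python
def resolve_language(supported, explicit_choice=None, accepted=(),
                     geo_guess=None, default="en"):
    """Pick a language, trusting the user's explicit choice most and geolocation least.

    Returns (tag, source) so the caller can tell a confident answer from a guess
    and still offer the user a way to change it.
    """
    available = {tag.lower(): tag for tag in supported}

    candidates = []
    if explicit_choice:                      # e.g. read from a preference cookie
        candidates.append(("user", explicit_choice))
    candidates.extend(("accept-language", tag) for tag in accepted)
    if geo_guess:                            # e.g. a language guessed from the IP address
        candidates.append(("geolocation", geo_guess))

    for source, tag in candidates:
        match = available.get(tag.lower())
        if match:
            return match, source
    return default, "default"


site_languages = ["en", "de", "sv"]
print(resolve_language(site_languages, explicit_choice="sv",
                       accepted=["de"], geo_guess="de"))
print(resolve_language(site_languages, geo_guess="de"))
```

Returning the source alongside the tag is a design choice rather than a requirement: whenever the answer came from "geolocation" or "default", the site can make its language switcher more prominent.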
Overall, the whole thing is a bit of a patchwork mess. Each Web technology seems to choose a different approach, none of which is wholly wrong. And, indeed, there is work to try to address this at W3C. Specifically, the I18N WG is trying to complete work on two documents: "LTLI" (Language Tags and Locale Identifiers) and WS-I18N, promoting other standards (CLDR! IETF BCP 47!) and trying to lobby other W3C working groups (for example, WebApps), sometimes with some success, to provide for consistent approaches.
Do note that the latest BCP 47 (RFC4646bis) is in last call at the IETF right now. One thing browser vendors could do is implement it, since that would address some gaps in language coverage as well as the problem of script identification in locale identifiers.
And anyone interested should really consider participating in the W3C Internationalization WG. We could use the help.
Globalization Architect -- Lab126
Chair -- W3C Internationalization WG
Internationalization is not a feature.
It is an architecture.
> -----Original Message-----
> From: email@example.com [mailto:firstname.lastname@example.org]
> On Behalf Of Peter Krefting
> Sent: Monday, April 20, 2009 1:57 AM
> To: Unicode Mailing List
> Cc: email@example.com
> Subject: Re: Determining Locale in a Browser for Web 2.0
> > Will HTTP Accept-Language ever give you any more information than
> It may, or may not. It might even be the same, depending on what
> browser you
> set to "sv-SE,sv;q=0.9,nb;q=0.8,da;q=0.7,en;q=0.6" (I used to
> include "de"
> with a really low score as well, but some buggy servers then always
> sent me
> German instead of English, so I stopped doing that).
> Whether or not you will have a country code or just a language code
> on the browser, its user and the system it is on.
> > So I am just wondering if anyone has been thinking about exposing
> > specific locale information inside of web browsers? For example,
> > browser could just read the OS's locale information and expose
> that in a
> But then you run into the problem of trying to figure out which
> setting is
> authoritative. I am currently running an English OS, but it was
> installed with a Norwegian locale (I live and work in Norway) and
> my user
> is set up for Swedish. Depending on what data software looks at,
> prompt me in either English, Norwegian or Swedish, seemingly
> randomly. A bit
> For web applications, the Accept-Language is usually the one that
> is most
> correct, as people tend to set it up to get Google to work properly,
> if it says just "en" or "en-US", in which case the user didn't care
> change the default.
> Using Geolocation is usually bad, but it depends on what type of
> you provide. I'm quite happy to get Swedish text (from Accept-
> Language) with
> prices in Norwegian currency (from geolocation) when I browse
> flights at my
> friendly local airline operator. But I'm equally unhappy with sites
> that I want Norwegian *text* just because I'm in Norway (and with
> having more than one official language, that becomes even more fun).
> One thing you especially should *not* look at when deciding
> language is
> either the operating system or browser UI language. I know a lot of
> using either or both in English, but wanting another language for
> text (or being forced to use English because the software isn't
> for their language).
> \\// Peter - http://www.softwolves.pp.se/