Re: RFC 1766 language tags

From: Martin J. Duerst
Date: Mon Jun 16 1997 - 14:17:47 EDT

On Mon, 16 Jun 1997, Andrew Daviel wrote:

> On Fri, 13 Jun 1997, Jordan Reiter wrote:
> > I personally am fluent in only one language (because I'm an American ;-7),
> > so all of my web pages have been in a single language. I wonder, aside
> > perhaps from character set configuration, what reasons there are for
> > defining the language of a block of text?
> .. well, it might be useful for automated resource discovery, search
> engines with thesauri, etc. etc. - so that if you were in an "en-gb" block
> pavement == footpath, while in an "en-us" block pavement == roadway,
> or setting up speech synthesis systems, etc.

Yes, search is probably the most important use. Even when the words mean
the same, you may want to restrict a search to a particular language. For
example, you might look for something about Paris in English and not be
interested in French results because you don't understand French. A good
search engine will be able to filter out French and other languages for
you if the texts are tagged appropriately.
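
As a rough illustration of such filtering, here is a minimal Python sketch.
It assumes each document carries an RFC 1766 style tag in a hypothetical
"lang" field, and it treats a requested tag as matching any document tag
that is equal to it or extends it with further subtags (so "en" matches
"en-GB" and "en-us"). The function names and the sample documents are
made up for the example; a real search engine would do far more.

```python
def matches_language(tag: str, wanted: str) -> bool:
    """Return True if `tag` equals `wanted` or is a more specific variant of it.

    Tags are compared case-insensitively and subtag by subtag, so "en"
    matches "en-GB" but does not match a tag such as "eng".
    """
    tag_parts = tag.lower().split("-")
    wanted_parts = wanted.lower().split("-")
    return tag_parts[:len(wanted_parts)] == wanted_parts


def filter_by_language(documents, wanted):
    """Keep only the documents whose (hypothetical) "lang" field matches `wanted`."""
    return [doc for doc in documents if matches_language(doc["lang"], wanted)]


# Made-up sample data: three pages about Paris, tagged by language.
documents = [
    {"title": "A Walk Through Paris", "lang": "en-GB"},
    {"title": "Paris la nuit", "lang": "fr"},
    {"title": "Paris Travel Notes", "lang": "en-us"},
]

english = filter_by_language(documents, "en")
for doc in english:
    print(doc["title"], doc["lang"])
```

Asking for "en" keeps both English pages and drops the French one; asking
for "en-gb" would keep only the first. The match is deliberately one-way:
a document tagged just "en" does not satisfy a request for "en-gb".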

> There's also directionality (left-right vs. right-left {Hebrew, etc.}),
> but I think that's addressed by a separate tag.

Yes. There are too many languages, some of them not very well known,
that are written RTL, so basing directionality on the language tag
didn't work out.

> ... I'm not sure whether CSS addresses things that might change between
> languages, such as number representation, hyphenation, quotations, braces
> etc., but it might be useful to define alternate style sheets for each
> language used on a page ...

I think it doesn't now, but it should do so in the future. Please note
that one has to be careful with mixed text. Even if you choose a
typical French font for French texts, and a typical English font
for English texts, this doesn't mean that every single French word,
sentence, or paragraph inside an English document should automatically
use a different font from the rest of the text. That would look rather
ugly. The same applies to Chinese and Japanese. For all kinds of
combinations, existing practice (which takes much longer to establish
in typography than in computer science) has to be examined, and
new practices have to be developed where there are no existing
ones or the existing ones are not appropriate. With the new
possibilities, multilingual texts at first will look like the first
experiments of hobby designers on systems with a lot of fonts,
and like the first or second generation of web pages (with blink
and other atrocities). Good design will come later. And in the
case of multilingual documents, it depends crucially on more
and better matching fonts, which require a lot of work.

Regards, Martin.
