displaying Unicode text (was Re: Transcriptions of "Unicode")

From: Mark Davis (mark@macchiato.com)
Date: Wed Dec 06 2000 - 20:59:15 EST

Let's take an example.

- The page is UTF-8.
- It contains a mixture of German, dingbats and Hindi text.
- My locale is de_DE.

From your description, it sounds like Mozilla works as follows:

- The locale maps (I'm guessing) to 8859-1.
- 8859-1 maps to, say, Helvetica.
- The dingbats and Hindi appear as boxes or question marks.

This would be pretty lame, so I hope I misunderstand you!!

What people really want, at a minimum, is for every character in the document to be displayed legibly whenever fonts that can display those characters are installed on the system. That is, if I have Helvetica, Zapf Dingbats, and Pratiksa (a Devanagari font), then for the above page the browser could use Helvetica for the German text, Zapf Dingbats for the dingbats, and Pratiksa for the Hindi text, switching as necessary.
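
As a rough sketch of that per-character switching (not how any particular browser does it), the text can be split into runs, each assigned the first installed font that covers its characters. The coverage table, the code point ranges, and the function names font_for and split_into_runs are assumptions made up for illustration; a real implementation would read coverage from the fonts' own character maps.

```python
# Hypothetical coverage table: font name -> list of (first, last) code point
# ranges. The ranges are illustrative assumptions, not real font data.
FONT_COVERAGE = {
    "Helvetica": [(0x0020, 0x007E), (0x00A0, 0x00FF)],  # Latin-1 text
    "Zapf Dingbats": [(0x2700, 0x27BF)],                # Dingbats block
    "Pratiksa": [(0x0900, 0x097F)],                     # Devanagari block
}


def font_for(ch, fonts=FONT_COVERAGE):
    """Return the first font whose ranges contain ch, or None if none does."""
    cp = ord(ch)
    for name, ranges in fonts.items():
        if any(first <= cp <= last for first, last in ranges):
            return name
    return None


def split_into_runs(text, fonts=FONT_COVERAGE):
    """Group consecutive characters that resolve to the same font."""
    runs = []
    for ch in text:
        font = font_for(ch, fonts)
        if runs and runs[-1][0] == font:
            # Same font as the previous character: extend the current run.
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs
```

A run whose font is None is the only place where a box or question mark would still be needed, since no installed font covers those characters.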

I agree that if the web page author wanted a *specific* rendering of, say, CJK characters with a Japanese style -- or even more precisely, with a Hon Mincho font -- then that requires markup. Similarly with Polish vs. French, or Serbian vs Russian styles. However, it is not exactly rocket science to display the characters with *some* reasonable font, if one that can handle the characters is installed, so it would be disappointing if it worked as above.

----- Original Message -----
From: "Erik van der Poel" <erik@netscape.com>
To: "Unicode List" <unicode@unicode.org>
Cc: "Unicode List" <unicode@unicode.org>
Sent: Monday, December 04, 2000 22:08
Subject: Re: Transcriptions of "Unicode"

> Mark Davis wrote:
> >
> > What wasn't clear from his message
> > is whether Mozilla picks a reasonable font if the language is not there.
> Sorry about the lack of clarity. When there is no LANG attribute in the
> element (or in a parent element), Mozilla uses the document's charset as
> a fallback. Mozilla has font preferences for each language group. The
> language groups have been set up to have a one-to-one correspondence
> with charsets (roughly). E.g. iso-8859-1 -> Western, shift_jis -> ja.
> When the charset is a Unicode-based one (e.g. UTF-8), then Mozilla uses
> the language group that contains the user's locale's language.
> In other words, Mozilla does not (yet) use the Unicode character codes
> to select fonts. We may do this in the future.
> Erik
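
The selection order Erik describes can be sketched roughly as follows. This is only an illustration of the description above, not Mozilla's actual code: the two charset mappings come from his message, while the function name language_group, the language table, and the handling of the LANG value are assumptions.

```python
# Charset -> language group, using the two examples given in the message.
CHARSET_TO_GROUP = {"iso-8859-1": "Western", "shift_jis": "ja"}

# Charsets treated as Unicode-based (only UTF-8 is named in the message).
UNICODE_CHARSETS = {"utf-8"}

# Hypothetical language -> language group table, for illustration only.
LANGUAGE_TO_GROUP = {"de": "Western", "ja": "ja"}


def language_group(lang_attr, charset, locale):
    """Pick a language group from LANG, else the charset, else the locale.

    The character codes in the text are never consulted, which is why
    Hindi in a UTF-8 page viewed under de_DE still lands in "Western".
    """
    if lang_attr:
        # Assumption: the primary language subtag selects the group.
        return LANGUAGE_TO_GROUP.get(lang_attr.split("-")[0].lower())
    charset = charset.lower()
    if charset in UNICODE_CHARSETS:
        # Unicode-based charset: use the group of the user's locale language.
        return LANGUAGE_TO_GROUP.get(locale.split("_")[0].lower())
    return CHARSET_TO_GROUP.get(charset)
```

For the example page (UTF-8, no LANG attribute, locale de_DE), language_group(None, "UTF-8", "de_DE") returns "Western", so every character on the page is drawn with the Western font preference.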
