[OT] Re: relation between unicode and font

From: Andrew Cunningham (andjc@ozemail.com.au)
Date: Fri Jan 05 2001 - 22:04:41 EST

Hi everyone,

Actually, there is a bug in the browsers, or at least in Internet Explorer.
It's been there in versions 4, 5 and 5.5.

Yes, a lot of 8-bit fonts exist. Many of these 8-bit fonts follow Microsoft's
codepages rather than the ISO-8859 series, in that they place characters in
the C1 zone (bytes 0x80-0x9F).

For instance, if I were creating a Vietnamese page in VISCII encoding, I'd
associate the VISCII fonts with the User Defined encoding in the web
browsers. This works fine in Netscape, but doesn't work in Internet
Explorer.

For some reason known only to Microsoft, since version 4 of their browser
the User Defined slot carries out a conversion similar to the Western
(Windows) encoding: the characters in the C1 zone are remapped, based on
Win-1252, to the appropriate values in Unicode. Why this mapping was ever
applied to the User Defined slot, I'll never know.
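
To make the remapping concrete, here is a small Python sketch of the two
behaviours. It is only an illustration: it uses Python's own "latin-1" and
"cp1252" codecs as stand-ins for "pass the byte through" and "remap via
Win-1252", and the function names are made up for this example. It says
nothing about how either browser is implemented internally.

```python
# Illustration only: compare pass-through with a Win-1252 style remapping
# for single bytes. Function names are hypothetical, not any browser API.

def passthrough(byte):
    """Code point when the byte value is kept as is (latin-1 is 1:1)."""
    return ord(bytes([byte]).decode("latin-1"))


def remapped_1252(byte):
    """Code point after a Win-1252 conversion, or None if unassigned."""
    try:
        return ord(bytes([byte]).decode("cp1252"))
    except UnicodeDecodeError:
        # Python's cp1252 table leaves a few C1 bytes undefined.
        return None


if __name__ == "__main__":
    for b in (0x41, 0x80, 0x85, 0x99):
        r = remapped_1252(b)
        shown = "unassigned" if r is None else "U+%04X" % r
        print("byte 0x%02X: pass-through U+%04X, Win-1252 %s"
              % (b, passthrough(b), shown))
```

A custom 8-bit font that keeps a glyph at position 0x80 can only be reached
if the byte arrives unchanged; once it has been turned into U+20AC, the
lookup goes somewhere else.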

If you prepare a VISCII web page containing all the lower case Vietnamese
vowels, you'll discover that some of the vowels cannot be displayed in
Internet Explorer at all, while Netscape 4.x passes these through as is and
displays them.

Unicode is a boon these days: it means I can create a Vietnamese web page
that can display in Netscape AND Internet Explorer.
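
As a rough sketch of what the Unicode route looks like in practice, the
Python below takes a short Vietnamese string and produces two forms that can
go into a page: UTF-8 bytes, and plain ASCII with numeric character
references. The sample text and variable names are just for illustration.

```python
# Illustration only: two ways to carry Vietnamese text without relying on
# a custom 8-bit font encoding.

# "Tieng Viet" with its diacritics, written with escapes so this file
# itself stays ASCII.
text = "Ti\u1ebfng Vi\u1ec7t"

# Option 1: serve the page as UTF-8 (declare charset=utf-8 for the page).
utf8_bytes = text.encode("utf-8")

# Option 2: keep the page ASCII and use numeric character references.
ncr_bytes = text.encode("ascii", "xmlcharrefreplace")

if __name__ == "__main__":
    print(utf8_bytes)
    print(ncr_bytes)
```

Neither form touches the C1 zone as a font position, so there is nothing for
a codepage remapping to rearrange.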

Any custom 8-bit encoding that has characters in the C1 zone may have the
same problem.
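
One quick way to see whether a given page in a custom 8-bit encoding is
exposed is to look for bytes in the C1 zone. The helper below is a minimal
sketch; the function name and the sample bytes are made up and do not
correspond to any particular encoding.

```python
# Illustration only: report which C1-zone byte values (0x80-0x9F) occur
# in some 8-bit encoded data.

def c1_bytes_used(data):
    """Return the sorted, de-duplicated C1-zone byte values found in data."""
    return sorted({b for b in data if 0x80 <= b <= 0x9F})


if __name__ == "__main__":
    sample = b"abc\x85\xe1\x85\x9f"  # arbitrary bytes, not real VISCII text
    print(["0x%02X" % b for b in c1_bytes_used(sample)])
```

An empty result means the data never uses the C1 zone, so a remapping of that
zone would not affect it.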

Working with multilingual public internet access becomes problematic: IE
is only suitable for encodings that have built-in support in the browser,
and useless for encodings like VISCII that are transformed by the browser,
making some of the characters undisplayable.

This is one of the reasons that my industry hasn't widely accepted Internet
Explorer as a default browser. It can't handle the languages we need to use,
which are community languages rather than commercial languages.

It is also one of the reasons that we try to encourage the use of Unicode.


Andrew Cunningham
Multilingual Technical Project Officer
VICNET, State Library of Victoria


----- Original Message -----
From: Yung-Fong Tang <ftang@netscape.com>
To: Unicode List <unicode@unicode.org>
Cc: Unicode List <unicode@unicode.org>
Sent: Saturday, January 06, 2001 6:29 AM
Subject: Re: relation between unicode and font

> Not really a browser bug. It is a bug in the FONT. Some fonts
> basically claim they are designed for a certain encoding in which 0x00-0x7F
> represents ASCII, while the glyphs at those positions in the font have
> non-ASCII shapes. If the font author *lies* to the browser in the information
> encoded in the font, there is nothing the browser (or browser
> developer) can do.
>
> Jukka.Korpela@hut.fi wrote:
> > On Thu, 4 Jan 2001, sreekant wrote:
> >
> >
> >> <font face="Tikkana">A B </font> is being shown as some telugu
> >> characters.
> >
> > That's basically a browser bug, though some people have seen it
> > as a method of extending character repertoire. It has absolutely
> > nothing to do with Unicode. For an explanation of the fallacy, see
> > http://ppewww.ph.gla.ac.uk/%7eflavell/charset/fontface-harmful.html
> > http://babel.alis.com/web_ml/html/fontface.html
> >
