Re: Test your web browser! Unicode 5.0 charts in HTML on French Wikipedia

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed Nov 29 2006 - 05:43:25 CST


    I know this reply comes late, but IE6 and IE7 do not automatically use Arial Unicode MS, even when it is installed. Instead, they pick default fonts by large character ranges, and seem to use Tahoma as the default for most scripts that have a native implementation in Windows, or an East Asian font for Han.
    They also use the fonts preset in the per-script font settings panel (a few are preconfigured, for example for Hebrew).
    But if the user has configured more fonts for the selectable scripts in the browser, IE uses them; so you can effectively have IE display those scripts without being forced to "Arial" or whatever has been set for the Latin script.

    The main problem is Wikipedia's default font selection, which the default Monobook.css stylesheet applies to all elements of the body: it lists some fonts and omits many others. Because of this stylesheet, whatever the user has set in the browser is ignored, so if a character is not found in the CSS-selected fonts, IE falls back to its internal default core fonts for the scripts supported by Windows. For Latin, that is "Times New Roman" for the standard "serif" pseudo-font family, Tahoma for the standard "sans-serif" pseudo-family, and "Courier New" for the standard "monospace" pseudo-family. Note that Wikipedia's Monobook.css stylesheets include these standard pseudo-families as the last font option, so the browser-specific built-in last-resort fonts are used.

    The last-resort font selection in Firefox is much more thorough; it tries to find a matching font according to:
    (1) the user's per-script font settings: IE6 and IE7 still do not consult these for the last resort (i.e. when a web page specifies, for the text elements to render, one or more explicit font families including "monospace", "serif" or "sans-serif", whether in style attributes, in embedded CSS rules or in external stylesheets)
    (2) the properties of the fonts enumerated on the system
    so it renders missing glyphs much better, without requiring a website to specify a large list of candidate fonts.
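
    To make that lookup order concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not Firefox's actual code: the coverage table, the font name "Hypothetical Tifinagh", the toy script_of() helper and pick_font() itself are all invented for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical coverage table: font name -> code point ranges it has glyphs for.
Coverage = Dict[str, List[Tuple[int, int]]]


def covers(font: str, cp: int, coverage: Coverage) -> bool:
    """True if the named font is installed and has a glyph for code point cp."""
    return any(lo <= cp <= hi for lo, hi in coverage.get(font, []))


def script_of(cp: int) -> Optional[str]:
    """Toy script lookup covering only the few blocks used in this example."""
    if 0x2D30 <= cp <= 0x2D7F:
        return "Tifinagh"
    if 0x0590 <= cp <= 0x05FF:
        return "Hebrew"
    if cp <= 0x024F:
        return "Latin"
    return None


def pick_font(ch: str, css_families: List[str],
              user_script_fonts: Dict[str, str],
              coverage: Coverage) -> Optional[str]:
    cp = ord(ch)
    # Fonts named by the page come first; generic names such as "serif"
    # are not in the coverage table here, so they simply fall through.
    for font in css_families:
        if covers(font, cp, coverage):
            return font
    # (1) The user's per-script font setting, if one exists for this script.
    script = script_of(cp)
    preferred = user_script_fonts.get(script) if script else None
    if preferred and covers(preferred, cp, coverage):
        return preferred
    # (2) Any installed font whose enumerated properties cover the character.
    for font in sorted(coverage):
        if covers(font, cp, coverage):
            return font
    return None  # nothing suitable: a "missing glyph" box would be drawn


coverage: Coverage = {
    "Arial": [(0x0020, 0x024F)],
    "David": [(0x0590, 0x05FF)],
    "Hypothetical Tifinagh": [(0x2D30, 0x2D7F)],
}
```

    With this table, pick_font("\u2d30", ["Arial", "sans-serif"], {}, coverage) reaches step (2) and finds the Tifinagh font even though the user never configured one, which is the behaviour described for Firefox. Dropping step (1) and replacing step (2) with a fixed built-in font per generic family would approximate what IE is described as doing.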

    Note also that the IE control panel offers no way to select fonts for scripts that are missing from its list of scripts. For example, Tifinagh is not present in the combobox, so one cannot preset IE with a Tifinagh font, even if one is installed on the system and even though IE can render Tifinagh correctly with that font when the web page specifies its name explicitly.

    Firefox will find a Tifinagh font on the system automatically, even if it is not preselected in its control panel or set up manually by the user, just by looking at the properties of the installed fonts enumerated during browser startup.

    ----- Original Message -----
    From: "James Kass" <thunder-bird@earthlink.net>
    To: "Philippe Verdy" <verdy_p@wanadoo.fr>; <unicode@unicode.org>
    Sent: Wednesday, October 25, 2006 4:09 AM
    Subject: Re: Test your web browser! Unicode 5.0 charts in HTML on French Wikipedia

    >
    > Philippe Verdy wrote,
    >
    >> You're wrong. The article names use the capital on Unicode. Look
    >> at the effective title displayed after you link there.
    >>
    >> What I forgot is the "wiki/" part in the full URL.
    >
    > O.K..
    >
    >> But even then, Wikipedia
    >> proposes you the correct name and redirects you immediately to
    >> the articles (so even these links DO work).
    >
    > Earlier I only read the 404 file not found error message. But, I see
    > that Wikipedia does propose the correct link. However, on my system
    > the automatic redirection to the correct link does not work, even after
    > waiting a full minute. Maybe my browser security settings forbid
    > such redirection.
    >
    > Unfortunately, many of the charts display with "missing glyph" boxes,
    > even though I have fonts installed and the browser properly configured
    > to display these characters. This seems to happen mostly in the Latin/
    > Greek/Cyrillic ranges. What might be happening is that the author,
    > in the CSS sheet, has specified a list of fonts beginning with "Arial Unicode MS".
    >
    > When MSIE 6.0 finds a page linked to a style sheet such as this, it first looks
    > for "Arial Unicode MS". If the browser does not find "Arial Unicode MS", it
    > is very happy to display the missing glyph boxes from regular "Arial" and
    > disregards the rest of the font suggestion list.
    >
    > This must happen to a lot of other people, too. Many people do not have
    > the "Arial Unicode MS" font, but the regular "Arial" font is ubiquitous.
    >
    > The web pages for the Universal Declaration of Human Rights in many
    > languages suffer from this same "error". They are not readable on
    > many systems. (Unless they've been fixed since the last time I looked.)
    >
    > Best regards,
    >
    > James Kass


