Re: Unicode HTML, download

From: Peter Kirk
Date: Mon Nov 22 2004 - 13:38:14 CST

    On 22/11/2004 19:13, Philippe Verdy wrote:

    > ...
    > Selection of fonts per script is a nightmare for most users, because
    > fonts don't always cover all the needed subset for a language,
    > producing partial rendering or inconvenient/horrible rendering with
    > multiple font designs with various metrics.
    > Just think about how you select the preferred font for "Unicode"
    > encoded documents in their browsers... Users will probably select
    > Arial Unicode MS or Lucida Console if they have these fonts, but will
    > then be able to display reliably only European languages in
    > Latin/Greek/Cyrillic or modern Hebrew/Arabic scripts, without having
    > any way to support also other languages that DO need the Unicode
    > encoding, in absence of other suitable charsets.
    > Some good examples: Georgian, Armenian are not covered by those
    > Unicode fonts, as well as many letters of the new ISO 8859 Latin
    > Celtic standard set created by Michael Anderson, and even some
    > Latin-based African languages widely spoken and written in Europe like
    > Berber, are not covered correctly with those "Unicode" fonts so
    > authors need to create documents with too many text encoding hacks
    > (like the inclusion of Greek approximative similar letters, or
    > specific style markup for only a few letters).
    Philippe, I share your concerns here. A font which claims to cover
    "Latin" may not include the schwa needed by Azerbaijani etc. One which
    claims to cover "Cyrillic" may only cover the subset used by the major
    Slavic languages. One which claims to cover "Greek" may be restricted to
    monotonic modern Greek. One which claims to cover "Hebrew" may have no
    accents for biblical Hebrew. The various selection mechanisms don't work
    at all well in such cases: one ends up with square boxes or with
    sometimes atrociously ill-fitting substitutes. The only safe way, at the
    moment, is to use fonts which actually cover the whole of certain
    Unicode blocks, and not the kinds of subsets defined by WGL4.
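
    To make "covers the whole of a block" concrete, here is a rough Python
    sketch of such a check. It assumes you already have the set of code
    points a font maps (for example, the keys of its cmap table, extracted
    with whatever font tool you prefer). The BLOCKS table and the function
    names are my own illustrative choices, not part of any font or library,
    and "assigned" means assigned in the Unicode version bundled with your
    Python.

```python
import unicodedata

# Illustrative block ranges (inclusive), taken from the Unicode block list.
# This table is an assumption of the sketch; extend it as needed.
BLOCKS = {
    "Latin Extended-B": (0x0180, 0x024F),
    "Greek and Coptic": (0x0370, 0x03FF),
    "Cyrillic": (0x0400, 0x04FF),
    "Hebrew": (0x0590, 0x05FF),
}


def missing_from_block(covered, block_name):
    """Return the assigned code points of a block that `covered` lacks.

    `covered` is any set-like collection of integer code points, e.g. the
    code points a font claims to map.
    """
    first, last = BLOCKS[block_name]
    missing = []
    for cp in range(first, last + 1):
        # "Cn" marks code points unassigned in this Python's Unicode data;
        # a font cannot reasonably be expected to cover those.
        if unicodedata.category(chr(cp)) == "Cn":
            continue
        if cp not in covered:
            missing.append(cp)
    return missing


def covers_block(covered, block_name):
    """True if every assigned code point of the block is in `covered`."""
    return not missing_from_block(covered, block_name)


if __name__ == "__main__":
    # A pretend font that only covers the basic Russian alphabet.
    basic_russian = set(range(0x0410, 0x0450))
    gaps = missing_from_block(basic_russian, "Cyrillic")
    print(covers_block(basic_russian, "Cyrillic"), len(gaps))
    for cp in gaps[:5]:
        print(f"U+{cp:04X} {unicodedata.name(chr(cp), '?')}")
```

    A font that passes this kind of test for the Cyrillic block would
    include, for instance, the Cyrillic schwa (U+04D9), which a
    "major Slavic languages" subset would leave out.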

    For this reason I applaud what SIL has done in releasing Doulos SIL
    (not to be confused with SIL Doulos!), which aims to cover all Latin and
    Cyrillic script characters, as well as Ezra SIL, which offers the same
    for Hebrew, and Galatia SIL for Greek (but not Coptic). The font names
    may be confusing, but the concepts are good. Gentium also supports
    almost all Latin and Greek.

    Peter Kirk

    This archive was generated by hypermail 2.1.5 : Mon Nov 22 2004 - 14:38:00 CST