Re: Unicode HTML, download

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Mon Nov 22 2004 - 13:13:10 CST

    Isn't WGL4 now deprecated in favor of the European MES-1 to MES-3
    recommendations, which include WGL4 (and most ISO 8859 or Windows ANSI
    charsets) within MES-1 or MES-2, and leave MES-3 open for all new additions
    in Unicode to European scripts?

    Are there fonts fully compliant with MES-2 anyway, given that these MES
    subsets are now left unmaintained, and that MES-3 is just an open container
    with no strict definition, with which fonts can comply only for a
    predesignated Unicode version?

    Shouldn't there instead be standard character sets per language/country
    for internationalization? (I know that ICU integrates data about language
    coverage, but it is still not a standard, even if some of its data are
    being integrated into the new Unicode-hosted CLDR project.)

    To support all official E.U. languages, one would just need to support
    the union of these per-language subsets. This would also be a
    simplification for font designers, because fonts could be labelled more
    clearly as covering languages rather than character subsets.
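
    A minimal sketch of that union, in Python. The per-language letter lists
    below are hypothetical and deliberately partial (lowercase only); real
    data would have to come from a maintained source such as CLDR. The
    function names are illustrative, not part of any existing library.

```python
import string

BASIC_LATIN = set(string.ascii_lowercase)

# Hypothetical, partial samples for illustration only (lowercase letters).
LANGUAGE_LETTERS = {
    "fr": BASIC_LATIN | set("àâæçéèêëîïôœùûüÿ"),
    "de": BASIC_LATIN | set("äöüß"),
    "pl": BASIC_LATIN | set("ąćęłńóśźż"),
}

def required_repertoire(languages):
    """Union of the per-language subsets for the given language codes."""
    needed = set()
    for lang in languages:
        needed |= LANGUAGE_LETTERS[lang]
    return needed

def missing_characters(font_coverage, languages):
    """Characters the languages need that the font does not cover."""
    return required_repertoire(languages) - set(font_coverage)
```

    A font could then be labelled as covering a language exactly when
    missing_characters() returns an empty set for it.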

    It's a shame that users can't simply select their preferred fonts
    according to the languages they cover (this is probably also an
    opportunity for extensions to OpenType or similar font formats). For web
    browsers, this would also allow better automatic selection of fonts,
    provided that HTML documents properly specify the language of the text
    sections they contain (as they should, with the lang="?" attribute, and
    with the xml:lang="?" attribute in XHTML).
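
    As a rough sketch of what such a selection could look like (in Python),
    assume each font carries a set of language codes it claims to cover.
    The font names and the font_languages table are hypothetical; no
    current font format or browser is claimed to work this way.

```python
def pick_font(lang_tag, preferred_fonts, font_languages):
    """Return the first preferred font labelled as covering lang_tag.

    lang_tag is the value of a lang / xml:lang attribute, e.g. "fr-CA".
    The tag is tried as given, then with trailing subtags removed
    ("fr-CA", then "fr"). Returns None when no font matches.
    """
    parts = lang_tag.lower().split("-")
    candidates = ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]
    for font in preferred_fonts:
        covered = font_languages.get(font, set())
        if any(tag in covered for tag in candidates):
            return font
    return None

# Hypothetical labels: which languages each font declares it covers.
font_languages = {
    "ExampleSans": {"en", "fr", "de"},
    "ExampleKartuli": {"ka"},
}
preferred = ["ExampleSans", "ExampleKartuli"]
```

    With these labels, pick_font("fr-CA", preferred, font_languages) falls
    back from "fr-CA" to "fr" and selects ExampleSans, while a Georgian
    section (lang="ka") selects ExampleKartuli.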

    Selecting fonts per script is a nightmare for most users, because fonts
    don't always cover the whole subset needed for a language. The result is
    either partial rendering or an ugly mix of several font designs with
    different metrics.
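
    To illustrate how the mixed rendering arises, here is a small Python
    sketch of per-character font fallback. It is a simplification under
    stated assumptions: each font is reduced to a hypothetical name plus a
    set of covered characters, and real renderers work on shaped runs
    rather than single characters.

```python
from itertools import groupby

def font_runs(text, fonts):
    """Split text into runs, each drawn with the first font covering it.

    fonts is an ordered list of (name, coverage_set) pairs. Characters
    that no font covers end up in runs whose font name is None.
    """
    def first_covering(ch):
        for name, coverage in fonts:
            if ch in coverage:
                return name
        return None

    return [(name, "".join(chars))
            for name, chars in groupby(text, key=first_covering)]

# Hypothetical fonts: FontA lacks the Polish letter that FontB supplies.
fonts = [
    ("FontA", set("abcdefghijklmnopqrstuvwxyz")),
    ("FontB", set("ł")),
]
```

    Here font_runs("małe", fonts) splits a single four-letter word into
    three runs across two font designs, which is exactly the patchwork
    described above.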

    Just think about how users select the preferred font for "Unicode"-encoded
    documents in their browsers... They will probably select Arial Unicode MS
    or Lucida Console if they have these fonts, but will then be able to
    display reliably only European languages in Latin/Greek/Cyrillic or modern
    Hebrew/Arabic scripts, with no way to also support other languages that DO
    need the Unicode encoding, for lack of any other suitable charset.

    Some good examples: Georgian and Armenian are not covered by those
    "Unicode" fonts, nor are many letters of the new ISO 8859 Latin Celtic
    standard set created by Michael Everson. Even some Latin-based African
    languages that are widely spoken and written in Europe, like Berber, are
    not covered correctly by those fonts, so authors have to create documents
    with too many text encoding hacks (like substituting approximately similar
    Greek letters, or adding specific style markup for only a few letters).

    ----- Original Message -----
    From: "Peter Kirk" <peterkirk@qaya.org>
    > Well, of course if you are using languages which use only characters from
    > WGL4, you will use only these characters. So I could only understand your
    > recommendation as being to write only in languages which are supported by
    > WGL4, and not in other languages. And in practice that means, use only
    > European languages.


