Re: Unicode HTML, download

From: Doug Ewell
Date: Sun Nov 21 2004 - 11:35:17 CST

    Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

    > So if she really wants to include character compositions which are
    > only possible with Ezra SIL, she will need these two classes:
    > <style type="text/css"><!--
    > .he { font-family: "Arial Unicode MS", David, Myriam, Tahoma, Arial,
    > sans-serif;}
    > .heb { font-family: "Ezra SIL" }
    > .he, .heb { direction: rtl; }
    > //--></style>
    > and use preferably the "he" class name for all Hebrew characters which
    > can be represented with Unicode code points and Unicode fonts found in
    > common browsers, surrounding only the specific sections requiring the
    > SIL encoding mapped on ISO-8859-1 within <span class="heb"> elements.

    Absolutely not. No way. A document should NEVER contain text in two or
    more character encodings with changes indicated only by font.

    This approach will destroy searching capabilities, and will not ensure
    proper rendering in any event. The user who has Miriam but not Ezra SIL
    (or vice versa) will see some Hebrew text rendered properly and some
    improperly, for no apparent reason. This is worse than either the
    all-Unicode or all-Ezra approach. Don't do it, Elaine.
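
    To make the searching problem concrete, here is a minimal sketch in
    Python. The mapping table is HYPOTHETICAL: it is not the real Ezra SIL
    layout, just four Latin-1 letters assumed to be drawn as Hebrew glyphs
    by a hack font, and it assumes the legacy text is stored in logical
    order. The names legacy_to_unicode, count_hits, mixed_doc and
    unicode_doc are likewise invented for the illustration.

```python
# HYPOTHETICAL legacy mapping: Latin-1 letters that a "hack" font would draw
# as Hebrew glyphs. This is NOT the real Ezra SIL table, only an illustration.
LEGACY_TO_UNICODE = {
    "M": "\u05E9",  # shin
    "l": "\u05DC",  # lamed
    "w": "\u05D5",  # vav
    "c": "\u05DD",  # final mem
}


def legacy_to_unicode(text: str) -> str:
    """Convert font-hack text to real Hebrew code points (logical order assumed)."""
    return "".join(LEGACY_TO_UNICODE.get(ch, ch) for ch in text)


SHALOM = "\u05E9\u05DC\u05D5\u05DD"  # shin, lamed, vav, final mem in Unicode

# A mixed document: one occurrence in Unicode, one in the legacy encoding.
# Without the font, software sees the second one as the Latin letters "Mlwc".
mixed_doc = SHALOM + " ... " + "Mlwc"

# The same document coded entirely in Unicode.
unicode_doc = SHALOM + " ... " + legacy_to_unicode("Mlwc")


def count_hits(doc: str, word: str) -> int:
    """Count plain-text matches, as a search tool that ignores fonts would."""
    return doc.count(word)


print(count_hits(mixed_doc, SHALOM))    # 1: the legacy occurrence is invisible
print(count_hits(unicode_doc, SHALOM))  # 2: both occurrences are found
```

    A search tool never sees the font, only the stored characters, so the
    legacy-encoded occurrence can only be found by someone who already
    knows to type "Mlwc".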

    The only time a document should EVER be presented in mixed encodings is
    for direct illustration of encoding issues (intended for Unicode
    weenies) or in a MIME-like setting where the document is divided into
    logical sections, with the encoding of each section clearly indicated.
    This is true for all types of documents, not just Web pages.

    If Elaine suspects that some of her HTML will not be displayed properly
    with commonly available Unicode fonts, she will have to bite the bullet
    and either:

    (a) code the whole page in Unicode, and provide a link to a
    comprehensive-enough Hebrew Unicode font, OR

    (b) code the whole page in the legacy encoding, and provide a link to
    Ezra SIL.
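
    As a rough sketch of option (a), the Python below builds a page that
    is Unicode throughout, declares its encoding, uses one descriptively
    named class, and links to a font. The font URL, the class name
    "hebrew-unicode" and the function build_page are all placeholders
    made up for this example, not anything from Elaine's actual page.

```python
import html

# HYPOTHETICAL values: the font URL and class name are placeholders only.
FONT_URL = "https://example.org/fonts/hebrew-unicode-font"
CSS_CLASS = "hebrew-unicode"


def build_page(hebrew_text: str, font_url: str = FONT_URL) -> bytes:
    """Return a complete HTML page, encoded as UTF-8, with a single Hebrew class."""
    page = f"""<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css">
.{CSS_CLASS} {{ direction: rtl; }}
</style>
</head>
<body>
<p class="{CSS_CLASS}" lang="he">{html.escape(hebrew_text)}</p>
<p>If the Hebrew above does not display, install this font:
<a href="{html.escape(font_url)}">{html.escape(font_url)}</a></p>
</body>
</html>
"""
    # One encoding for the whole document, matching the charset declared above.
    return page.encode("utf-8")


page_bytes = build_page("\u05E9\u05DC\u05D5\u05DD")
```

    Option (b) would look the same in outline, with the whole page in the
    legacy encoding and the link pointing at Ezra SIL instead.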

    Cryptically naming these two CSS classes ".he" and ".heb", which
    provides no indication of which is the Unicode encoding and which is the
    Latin-1 hack, merely makes a bad suggestion worse.

    -Doug Ewell
     Fullerton, California

    This archive was generated by hypermail 2.1.5 : Sun Nov 21 2004 - 11:38:01 CST