Re: Criteria for including characters: typographic issues

From: Antoine Leca (
Date: Fri Dec 22 2006 - 07:49:11 CST


    Jukka K. Korpela wrote:
    > It seems to me that people who propose new characters that could also
    > be regarded as icons forget that a character is, by essence, subject
    > to typographic variation. [...] If text is presented in italics or
    > in bold, such rendering modifications should apply to all characters

    While I agree with you about size or boldness, and can agree if you add
    slanting (obliqueness), I feel italics is really a feature specific to the
    Latin script (and those derived from it, like Peter the Great's Cyrillic)
    and its typographic evolution, particularly in the formative years of the
    16th century, when two different [ISO15924] scripts for the minuscules
    (one Carolingian-derived, the other derived from the Aldine script) were
    used concurrently and thereafter mixed.
    I do not believe such a multi-script feature can be ported to all scripts.
    While it can be imitated by slanting, and has been, long ago for the Latin
    capitals and more recently to handle the HTML <EM> tag, as you described, I
    do not believe we should force any script to allow such a distinction.

    Why capitals (which were originally yet another script, also in concurrent
    use) and hiragana/katakana were encoded separately in Unicode, while italic
    or Arabic variations were not, is a result of encoding history, but I do not
    believe it should be used as a basis for discrimination to encode new

    Also see the Georgian case, which is even more comparable to Latin: we have
    three "input" scripts, and two were encoded in Unicode... The resulting
    encoding does not seem to be optimal, however.

    > What about the reservations? I recently realized that some special
    > characters may need special treatment, especially in italics.
    > Originally, italics means font style, a result of typographic design
    > that uses glyphs that resemble handwriting to some extent. Italics
    > letters are generally more slanted, but italics is far from simple
    > slanting. However, in sans-serif font design, italics fonts are often
    > rather similar to regular (upright) variant, just slanted, with
    > relatively small modifications. Moreover, many fonts lack italics
    > versions, and if text in such a font is "italicized" (using a program
    > command or markup), programs just perform slanting to regular glyphs.
    > To produce reasonably noticeable difference, they typically slant
    > quite a bit.
    > This means, for example, that if you have "|" and "\" in Arial
    > Unicode MS (which lacks an italic variant) and italicize the text,
    > "|" becomes slanted and looks more or less like "/", whereas "\"
    > becomes roughly like "|" or worse. This is bad, and it should
    > probably not happen, but it does. So if you managed to introduce a
    > new special character into Unicode, is this what you'd like to happen
    > to it?
    > Algorithmically "italicizing" a character may obscure or distort the
    > character badly, or it may just make it somewhat odd. Slanting the
    > copyright sign does not make it unrecognizable, but it doesn't do any
    > good either.

    That looks like a long way of saying that the <EM> tag, and its
    implementation in browsers relying on pre-i18n technologies in which fonts
    were classified by two bits, called "bold" and "italic", do not really
    scale up.
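
    To make the quoted point about algorithmic "italicizing" concrete, here is
    a minimal sketch of slanting as a horizontal shear, x' = x + y * tan(angle).
    The stroke coordinates, the function names and the 20-degree angle are
    assumptions chosen for illustration; they are not taken from any real font
    or rendering engine.

```python
import math

def slant(point, angle_deg):
    """Shear a point horizontally: x' = x + y * tan(angle), y unchanged."""
    x, y = point
    return (x + y * math.tan(math.radians(angle_deg)), y)

def stroke_angle(start, end):
    """Lean of a stroke measured from the vertical, in degrees.

    Positive means the stroke leans right going upward (like "/"),
    negative means it leans left (like a backslash), 0 is upright.
    """
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    return math.degrees(math.atan2(dx, dy))

def slanted_stroke_angle(stroke, angle_deg):
    """Lean of a stroke after both of its endpoints have been sheared."""
    start, end = (slant(p, angle_deg) for p in stroke)
    return stroke_angle(start, end)

# Hypothetical strokes as (bottom point, top point) in a unit em box.
VERTICAL_BAR = ((0.5, 0.0), (0.5, 1.0))    # "|", upright
BACKSLASH = ((0.75, 0.0), (0.25, 1.0))     # backslash, leans left going up

# Assumed synthetic slant angle; real renderers pick their own value.
SLANT_DEG = 20.0

if __name__ == "__main__":
    print("bar before:", stroke_angle(*VERTICAL_BAR))
    print("bar after:", slanted_stroke_angle(VERTICAL_BAR, SLANT_DEG))
    print("backslash before:", stroke_angle(*BACKSLASH))
    print("backslash after:", slanted_stroke_angle(BACKSLASH, SLANT_DEG))
```

    Under these assumptions the upright bar ends up leaning right by exactly
    the slant angle, while the backslash is pulled towards the vertical, and a
    slant equal to its original lean would make it fully upright. A glyph
    whose identity depends on its angle is therefore changed by the shear in
    a way that a letter is not.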

