Re: Using combining diacritical marks and non-zero joiners in a name

From: Asmus Freytag
Date: Sun Apr 20 2008 - 01:19:46 CDT

    On 4/19/2008 12:14 PM, Jukka K. Korpela wrote:
    > Asmus Freytag wrote:
    >> It's a deliberate limitation of Unicode conformance that it focuses
    >> its requirements on the *identity* of the character, not on the finer
    >> points of typography. In other words, the conformance seeks to ensure
    >> that writers know which characters to use to designate a combination
    >> of base and mark, and receivers know when they receive the data, which
    >> combination was intended.
    > That's the big picture, but there's still the question how poor the
    > rendering can be. For example, if "overprinting" implementation makes a
    > diacritic practically unrecognizable, is it conformant? What if it is
    > _barely_ recognizable, which means that it is not recognizable to many
    > people?
    Let's give an example: it is certainly conformant to use a font that
    makes the digit 1 and lowercase l identical, as many typewriters used to
    do. It's also conformant to use a font that makes the lowercase l and
    uppercase I look identical, as some deliberately bare-bones sans serif
    design might do. In this case, you can argue that there's a logical
    (stylistic) reason for the limitations of the font, not merely
    incompetence. However, as you point out, I tried to argue that, short of
    deliberate *mis*-identification, the problem of poor identification of
    characters in rendering is not so much a conformance issue but an issue
    of poor vs. good typography.
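
    As an illustration of the identity-versus-appearance distinction, here
    is a minimal Python sketch using only the standard unicodedata module
    (the helper name describe is made up for this example). It shows that
    the three look-alike characters remain distinct in the encoded data,
    whatever a particular font does with their glyphs.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point and Unicode character name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# Digit one, lowercase l and uppercase I may share a glyph in some fonts,
# but each has its own code point and character name.
for ch in ("1", "l", "I"):
    print(describe(ch))
```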

    > I don't think there are any fixed rules. The Unicode Standard says, more
    > or less, that you must not render an "A" as a "B", but in a world of
    > confusables, that's not very exact, and it's not really presented as a
    > conformance requirement, as far as I can see.
    Precisely. Unicode is not a prescription for a final-form document
    format. It has no business regulating the *precise* appearance or
    rendering of any character. The test therefore is not whether you can
    distinguish the characters on any given system using any given font, but
    whether, when exported to a system with high-quality typographic support
    where the fonts are tuned for recognizability of characters out of
    context, you do get to see the intended text, or not.
    > Diacritics themselves are confusable (think about caron vs. breve), and
    > their rendering is a further complication. A perfectly legible diacritic
    > can become mysterious when rendered in a wrong position. I think the
    > bottom line is that conformance requirements cannot deal with such
    > issues.
    And they therefore don't deal with them.
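
    A small sketch of the caron vs. breve case, assuming Python's standard
    unicodedata module (the helper name compose is hypothetical): however
    similar the two marks may look when rendered, the base-plus-mark
    sequences are different data and normalize to different characters.

```python
import unicodedata

CARON = "\u030C"  # COMBINING CARON
BREVE = "\u0306"  # COMBINING BREVE

def compose(base: str, mark: str) -> str:
    """Combine a base letter and a combining mark, then normalize to NFC."""
    return unicodedata.normalize("NFC", base + mark)

a_caron = compose("a", CARON)
a_breve = compose("a", BREVE)

# The two results are distinct characters with distinct names, regardless
# of how legibly a given font draws the marks.
print(unicodedata.name(a_caron))
print(unicodedata.name(a_breve))
print(a_caron == a_breve)
```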
    > Rather, we can judge pragmatically that _in a given context_
    > some rendering is so poor that it is unacceptable, wrong - not as a
    > matter of conformance to the Unicode Standard but other requirements.
    Correct - you can apply a typographic usability or quality yardstick,
    which in itself depends on the use you want to make of the text. The
    range of acceptable renderings of arbitrary sets of characters varies
    depending on whether you have ordinary text (possibly even only
    monolingual) in mind,
    or whether you need to support mathematical documents (with some rather
    stringent needs to distinguish certain glyphs, even across scripts and
    font styles) or whether you have a UI situation in a security sensitive
    > For example, in a context where diacritic marks frequently appear on
    > uppercase letters, the "overprinting" approach seems to be just _wrong_,
    > whereas in a more typical situation, it's just poor quality, or, at
    > times, acceptable quality.
    I would not hesitate to call overprinting a "poor" solution in any case.
    Given the wide availability of fonts and rendering libraries that can do
    much better, it's simply not state-of-the-art any longer, not even for
    baseline text support. Its presence should be limited to where such
    fonts and/or libraries are unavailable for now (perhaps certain types of
    small devices).

