RE: Generic base characters

From: Kent Karlsson
Date: Mon Jul 16 2007 - 13:50:52 CDT

    Asmus Freytag wrote:
    > > 1) I'm not so sure about that. It's better to have a single defined
    > > behaviour (assuming the characters in question are at all supported).
    > In cases like this, you not only have the question of which
    > *characters* are supported, but also which *character sequences*
    > are supported. Just like a font designed for some language other
    > than Swedish might have a glyph for the f and the j but, despite
    > supporting an fi and fl ligature, not support an fj ligature,
    > other parts of the layout system may legitimately not support
    > some sequences even if they support each letter and similar
    > sequences.

    While I would appreciate it if more fonts supported the fj ligature,
    I would expect no rendering system or font to insert a dotted
    circle between an f followed by a j just because they don't
    support that ligature. Instead they just output an f followed
    by a j, though the result sometimes is not perfect (but much
    better than getting a dotted circle in-between).
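
    As an illustration of that fallback, here is a minimal sketch of
    greedy ligature substitution. The ligature table is hypothetical
    (an imaginary font with fi and fl but no fj), and the presentation
    form characters U+FB01 and U+FB02 merely stand in for ligature
    glyphs; a real shaping engine works on glyph IDs and font tables.

```python
# Hypothetical ligature table for an imaginary font: fi and fl exist, fj does not.
# U+FB01 and U+FB02 stand in for the ligature glyphs (an assumption of this sketch).
LIGATURES = {"fi": "\uFB01", "fl": "\uFB02"}


def shape(text, ligatures=LIGATURES):
    """Greedy left-to-right ligature substitution with plain-glyph fallback.

    When a pair has no ligature, both letters are emitted unchanged;
    nothing (such as a dotted circle) is inserted between them.
    """
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in ligatures:
            out.append(ligatures[pair])  # supported ligature: one glyph for two letters
            i += 2
        else:
            out.append(text[i])  # unsupported pair: fall back to the single letter
            i += 1
    return out
```

    With this table, shape("fjord") simply yields the five individual
    letters, while shape("film") starts with the fi ligature.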

    > This is not a conformance issue but
    > one of quality and scope of an implementation.

    True. But, while formally conforming, it is still a bad idea
    to start inserting dotted circles where there are none in the input.

    > > 2) NBSP base is for sequences of combining characters preceded
    > > by beginning of string or by a control char. I think using NBSP
    > > as the implicit base in such cases is a reasonable behaviour.
    > > (Inserting a dotted circle is not.)
    > >
    > I've always understood that recommendation to be aimed at
    > preventing the combining mark from being handled in completely
    > weird ways, e.g. by trying to overhang it into empty space at
    > the beginning of a line. I see

    I would say that trying to overhang it into empty space at the
    beginning of a line is much LESS weird than getting a dotted circle
    there (where none was in the text).

    > nothing in the standard that prevents a higher level
    > protocol, such as Uniscribe, from overriding this behavior.

    Formally, no, but it is still a bad idea.
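
    For illustration, the NBSP behaviour discussed under point 2 can be
    sketched roughly as below. The function name is made up for this
    sketch, "combining character" is approximated as any character with
    a general category starting with "M", and "control char" as general
    category Cc; a real renderer would apply the implicit base at
    display time rather than rewrite the text.

```python
import unicodedata

NBSP = "\u00A0"


def add_implicit_bases(text):
    """Give NBSP as a base to combining marks that have no base at all.

    A mark counts as base-less here only if it is at the beginning of
    the string or directly after a control character (category Cc).
    Marks that follow any other character, including a base from
    another script or a symbol, are left untouched.
    """
    out = []
    prev = None
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        at_start = prev is None
        after_control = prev is not None and unicodedata.category(prev) == "Cc"
        if is_mark and (at_start or after_control):
            out.append(NBSP)  # implicit base, never a dotted circle
        out.append(ch)
        prev = ch
    return "".join(out)
```

    Note that a combining acute after a Cyrillic letter or a digit
    passes through unchanged: only a mark with nothing before it, or
    with a control character before it, gets the NBSP.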

    > > 3) This thread started talking about there actually being a base
    > > present in the text just before the combining sequence, just that
    > > the base was in another script (or some symbol/punctuation).
    > > That is not an error case from a text rendering point of view.
    > > There is no reason to start inserting dotted circle, NBSP,
    > > or anything else. Ligation, kerning, positioning adjustments
    > > are unlikely to work except for special cases, but some rough
    > > approximation (assuming again that the individual characters at
    > > all are supported by the rendering system and font used) should
    > > be output.
    > >
    > As I have pointed out, I regard the application of the policy to
    > these cases as one of the 'issues', because it can lead to
    > unintended (and limiting) results.


    > But I can understand why layout engine creators don't
    > want to support an anything goes approach, because doing that
    > at *high quality* is extremely expensive.

    Yes, but *high quality* for the unexpected cases was not required.
    Just that one got a reasonable approximation; dotted-circle-less.

    > That said, a better way to do the fallback would be appropriate.
    > John's suggested list of generic bases is a good way to indicate
    > a minimal level of support.

    I agree, for getting better-quality display, but there is still
    no reason to insert dotted circles for other cases.

    > >> Authors should not have
    > >> an expectation of portably exchanging buggy text with perfect
    > >>
    > A buggy text is one that has missing base characters. That's how
    > I meant this usage in my post. If you construed that differently
    > based on some real or perceived deficiency in how I worded that,
    > I'm sorry.

    I did not. But that was not the case that this thread was mostly
    concerned with.

    > But also, indicating that a renderer can't support something *is*
    > legitimately the business of the implementation. As another
    > example, I think that software that uses fallbacks for diacritics
    > and that can't raise stacked diacritics properly would be better
    > off causing a visible clash or even spacing the combining marks
    > than silently overstriking them.

    That would be better than inserting spurious dotted circles.

            /kent k
