Re: Accents of the same combining class displayed side by side

From: Asmus Freytag
Date: Fri Nov 07 2008 - 03:11:20 CST

    On 11/7/2008 12:06 AM, Michael Everson wrote:
    > On 7 Nov 2008, at 02:41, Doug Ewell wrote:
    >> As some have already pointed out, there are letters used in
    >> Vietnamese that have diacritics positioned side-by-side, or at least
    >> "sort of" side-by-side. The precomposed forms of these letters
    >> decompose to the base letter, plus the first diacritic, plus the
    >> second diacritic. Of course, these decompositions are immutable.
    > And indeed, Vietnamese does not, therefore, have this problem, because
    > the pre-composed glyphs can be counted on to be used to provide the
    > correct glyphs.
    These pre-composed characters decompose canonically. You can expect that
    any text created with them can and will be decomposed at some stage, and
    that decomposed text will likewise be recomposed. Therefore, you cannot
    base a durable, interchangeable appearance distinction on whether the
    text contains a decomposed sequence or a precomposed character.
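
    As a small illustration of the round trip, here is a sketch using
    Python's standard unicodedata module. The choice of U+1EA5 (a with
    circumflex and acute) as the sample character is mine, not something
    taken from the thread.

```python
import unicodedata

# U+1EA5 LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE (sample chosen for illustration)
precomposed = "\u1EA5"

# Canonical decomposition: base letter, then first diacritic, then second
decomposed = unicodedata.normalize("NFD", precomposed)
code_points = [f"U+{ord(c):04X}" for c in decomposed]
print(code_points)

# Canonical composition brings back the single precomposed character
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == precomposed)
```

    Since any process along the way may apply either normalization, a
    renderer that treated the two forms differently would produce
    unstable results.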
    >> I would consider it strange if a different application of the Latin
    >> script were to indicate the side-by-side rendering explicitly, by
    >> means of a special "combining mark joiner" control character, while
    >> Vietnamese text would not. It would be inconsistent and surprising,
    >> and it might make developers of fonts and rendering engines think the
    >> marks are not to be rendered side-by-side *unless* the control
    >> character is present, which would cause decomposed Vietnamese to be
    >> rendered in the non-preferred way.
    > There is no real problem, if you want to do it this way, providing
    > that there is no push-back when we propose some pre-composed
    > side-by-side diacritical marks. (We already have precedent for this in
    > a few UPA diacritical marks, so perhaps this will not be problematic.)
    Actually, there should be some push-back :-)

    As with accents above, the amount of room underneath a character is
    limited, so that one would expect the *normal* typographic treatment of
    narrow and tall accents below to be side-by-side, simply for space
    reasons. A raw stacking behavior would then be rather in the nature of a
    fallback mechanism for poor man's fonts and layout systems. In other
    words, whenever you propose side-by-side diacriticals you should have to
    make the case that the stacked alternative really exists, and really
    must be distinguishable from the side-by-side case.

    When you proposed the UPA, this appeared to be an isolated special
    case. If you are trying to encode all of the possible combinations of
    diacritics above and below as precomposed entities, I think you are
    making the encoding handle something that belongs on the display side.

    For example, for the Vietnamese sample glyphs I note that the accents on
    upper case characters are truly side by side, while on lower case
    characters there's a bit of stacking, but not fully raised, instead
    offset to the side a bit to keep the stack from getting too tall, but
    making it more narrow to fit on the lower case character. This is
    clearly a typographical adjustment and not a *notational* one.

    There should be no push-back where diacritics have become effectively
    fused, to the degree that the combination is a novel entity, even though
    it may have been created long ago from identifiable parts.
    Unicode is not a visual tinker-toy where you build up shapes
    graphically, but a character encoding. If the elements are not
    individual *characters* then a fused shape is acceptable as its own
    encoded character.
    >> I don't know how widespread Teuthonista is, but about 44 million
    >> people read Vietnamese.
    > The Vietnamese don't have a problem for the reason stated.
    On the contrary, all Latin characters that are the same base plus same
    accents as one of the "Vietnamese" characters will render the same,
    unless you support language specific glyph adjustments.
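
    To make this concrete, here is a rough sketch in Python (standard
    unicodedata module; the sample sequences are my own). Both marks have
    the same canonical combining class, so normalization never reorders
    them, and the sequence in the "Vietnamese" order is the same plain
    text whatever language it is meant for.

```python
import unicodedata

CIRCUMFLEX = "\u0302"  # COMBINING CIRCUMFLEX ACCENT
ACUTE = "\u0301"       # COMBINING ACUTE ACCENT

# Both marks are in canonical combining class 230 (above)
classes = (unicodedata.combining(CIRCUMFLEX), unicodedata.combining(ACUTE))
print(classes)

# Same class means the order of the marks is significant and is preserved
vietnamese_order = "a" + CIRCUMFLEX + ACUTE
other_order = "a" + ACUTE + CIRCUMFLEX
same_after_nfd = (unicodedata.normalize("NFD", vietnamese_order)
                  == unicodedata.normalize("NFD", other_order))
print(same_after_nfd)

# The sequence in the first order is canonically equivalent to U+1EA5,
# with nothing in the text itself to say which language it belongs to
is_equivalent = unicodedata.normalize("NFC", vietnamese_order) == "\u1EA5"
print(is_equivalent)
```

    Any distinction between a "Vietnamese" rendering and some other
    rendering of that sequence therefore has to come from outside the
    character data, for example from language tagging passed to the font.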

    > Michael Everson *

    This archive was generated by hypermail 2.1.5 : Fri Nov 07 2008 - 03:13:44 CST