Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

From: Peter Kirk
Date: Tue Aug 05 2003 - 16:47:46 EDT

    On 05/08/2003 09:42, Jim Allan wrote:

    > Peter Kirk posted:
    >> If I want to do this, should I explicitly encode a dotted circle, or
    >> should I encode nothing and expect the font to generate the dotted
    >> circle, as it often does?
    > I think that the practice of a font or application automatically
    > inserting a dotted circle under an orphaned combining character is
    > dubiously compliant with Unicode specifications.
    > ...
    Thanks, Jim, for all this data, but now I am totally confused. Well, at
    least it seems clear that if I want a dotted circle I should explicitly
    encode it. But if I don't...

    Suppose for example I want to write a sentence like "In this language
    the diacritic ^ may appear above the letters ...", but instead of ^ I
    want to use a combining character, a diacritic regularly centred above
    the letter, which does not have a defined spacing variant. I
    don't want a dotted circle. And I want it to be spaced as here, i.e.
    with one space before the diacritic and one after it. It seems to me
    that at one place in the standard I am told to encode space - combining
    mark - space, for the combining mark will not combine with the space
    because the space is not a base character; and in another place I am
    implicitly told to encode space - space - combining mark - space,
    because the second space acts as a carrier for the combining mark.
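The two candidate encodings can be written out as code point sequences. The following is only a minimal sketch, assuming Python's standard `unicodedata` module; U+0302 COMBINING CIRCUMFLEX ACCENT is an illustrative stand-in for whichever mark is meant, and the names `MARK`, `describe`, `option_a` and `option_b` are hypothetical. It shows what is encoded, not how any renderer will display it.

```python
import unicodedata

# Illustrative stand-in for the combining mark under discussion (an assumption).
MARK = "\u0302"  # COMBINING CIRCUMFLEX ACCENT

def describe(text):
    """Return (code point, general category, combining class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.combining(ch))
        for ch in text
    ]

# Reading A: space - combining mark - space
option_a = " " + MARK + " "
# Reading B: space - space - combining mark - space (second space as carrier)
option_b = "  " + MARK + " "

for label, text in (("A", option_a), ("B", option_b)):
    print(label, describe(text))

# Canonical normalization (NFC) leaves both sequences unchanged, so the two
# readings remain distinct strings in plain text.
print(unicodedata.normalize("NFC", option_a) == option_a)
print(unicodedata.normalize("NFC", option_b) == option_b)
```

The properties alone do not settle which reading is right: in both sequences the mark is a nonspacing mark (Mn) preceded by a space separator (Zs), and the difference is only whether one or two spaces come before it.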

    I hope that wanting to display this correctly is not another place where
    I "have stepped over the boundaries of what is reasonable to expect
    plain text to convey", but that this too can be "grist for the Unicode
    5.0 mill to grind very finely" - both quotes from Ken Whistler earlier
    today. And I think that if this issue is clarified it will also become
    clear what should be done about string-initial holam and alef etc.

    Perhaps a simple way ahead would be to define a new character something
    like COMBINING MARK HOLDER with no glyph, which is defined specifically
    for this purpose, is a base character and not a format character, and is
    expected to be just as wide as is necessary to display the combining
    mark. Then we could say that a spacing accent is equivalent (possibly
    even canonically if made a composition exclusion?) to COMBINING MARK
    HOLDER plus a non-spacing accent, and remove the misleading
    compatibility equivalences to SPACE plus a non-spacing accent.
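The existing compatibility equivalences to SPACE plus a non-spacing accent can be inspected directly. This is a small sketch, assuming Python's standard `unicodedata` module; the helper `compat_mapping` and the list `SPACING_ACCENTS` are hypothetical names chosen for illustration, and the four characters are just a sample of spacing accents.

```python
import unicodedata

# A sample of spacing accents from Latin-1 (chosen for illustration).
SPACING_ACCENTS = ["\u00a8", "\u00af", "\u00b4", "\u00b8"]

def compat_mapping(ch):
    """Split a character's decomposition field into (tag, [hex code points])."""
    fields = unicodedata.decomposition(ch).split()
    tag = fields[0] if fields and fields[0].startswith("<") else None
    code_points = [f for f in fields if not f.startswith("<")]
    return tag, code_points

for ch in SPACING_ACCENTS:
    tag, cps = compat_mapping(ch)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {tag} {' '.join(cps)}")

# The mapping is compatibility-only: NFKD replaces the spacing accent with
# SPACE + combining mark, while canonical normalization (NFC) leaves it alone.
acute = "\u00b4"
print(unicodedata.normalize("NFKD", acute) == " \u0301")
print(unicodedata.normalize("NFC", acute) == acute)
```

A COMBINING MARK HOLDER as proposed above does not exist in the character database, so the sketch can only show the SPACE-based mappings that the proposal would replace.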

    Peter Kirk
