Re: Questions on ZWNBS - for line initial holam plus alef

From: Philippe Verdy (
Date: Tue Aug 12 2003 - 08:49:54 EDT

    From: "Jon Hanna" <>

    > I was saying that it wouldn't be sensible to begin a line with a
    > combining diacritic, since that combining diacritic would be combining
    > with a newline character which it's difficult to think of any possible
    > sensible meaning for.

    A newline is a control character with a whitespace property and a
    line-breaking behavior. It must not combine with a combining diacritic,
    according to the definition of grapheme clusters in UAX #29.

    So <newline>+NSM is clearly defective and must be parsed as two distinct
    combining sequences: the first is the newline sequence, and the second
    is "defective" because the combining character has no base character to
    which it applies. The standard suggests using a dotted circle to render
    it in editors, but suggests nothing for the rendering of final
    documents, which could simply drop the defective sequence, display it
    with a replacement base character, use a dotted circle, or use an
    invisible glyph. So the result in this case is implementation
    dependent, and not interoperable.
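
    To make the point concrete, here is a minimal Python sketch of how a
    processor might detect a line-initial combining mark and show it the
    way an editor could, on a dotted circle. The function names are
    hypothetical, and testing for general category M* (Mn, Mc, Me) is my
    assumption about what counts as a "combining character" here; it is
    one possible behavior, not something the standard mandates.

```python
import unicodedata

DOTTED_CIRCLE = "\u25cc"  # U+25CC DOTTED CIRCLE


def is_combining_mark(ch):
    """True if ch has general category Mn, Mc or Me (assumed definition)."""
    return unicodedata.category(ch).startswith("M")


def find_defective_line_starts(text):
    """Return 0-based indices of lines whose first character is a combining mark."""
    return [
        i
        for i, line in enumerate(text.splitlines())
        if line and is_combining_mark(line[0])
    ]


def show_defective(text):
    """Editor-style display: put a dotted circle under each line-initial mark."""
    out = []
    for line in text.splitlines(keepends=True):
        if is_combining_mark(line[0]):
            line = DOTTED_CIRCLE + line
        out.append(line)
    return "".join(out)


# Second line starts with U+05B9 HEBREW POINT HOLAM followed by U+05D0 ALEF.
sample = "abc\n\u05b9\u05d0\n"
print(find_defective_line_starts(sample))
print(show_defective(sample) == "abc\n\u25cc\u05b9\u05d0\n")
```

    A final-form renderer could just as well drop the mark or use an
    invisible base, which is exactly why the result is not interoperable.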

    For me the term "difficult" is inappropriate. In fact it is invalid for
    interoperability (even though it is valid, not forbidden, for
    ISO 10646/Unicode, as a string fragment for intermediate processing),
    and such a sequence should not occur in actual documents outside of any
    external processing context which defines its behavior.

