From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue Aug 12 2003 - 08:49:54 EDT
From: "Jon Hanna" <jon@spin.ie>
> I was saying that it wouldn't be sensible to begin a line with a
> combining diacritic, since that combining diacritic would be combining
> with a newline character which it's difficult to think of any possible
> sensible meaning for.
A newline is a control character with a whitespace property and a
line-breaking behavior. It must not combine with a combining diacritic,
according to the definition of grapheme clusters in UAX #29.
So <newline>+NSM is clearly defective and must be parsed as two distinct
combining sequences: the first is the newline sequence, and the second
is "defective" because the combining character has no base character to
apply to. The standard suggests using a dotted circle to render it in
editors, but suggests nothing for the rendering of final documents,
which could simply drop the defective sequence, display it with a
replacement base character, use a dotted circle, or use an invisible
glyph. So the result in this case is implementation dependent, and not
interoperable.
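
To make the parsing rule concrete, here is a minimal Python sketch of how
an editor might locate such defective sequences and show them with a
dotted circle (U+25CC). The function names are hypothetical, and the test
for "can serve as a base" is a simplifying assumption based only on
general categories (letters, numbers, punctuation, symbols and space
separators); it ignores special cases such as ZWJ/ZWNJ and is not the
full grapheme cluster algorithm.

```python
import unicodedata

DOTTED_CIRCLE = "\u25cc"  # placeholder base suggested for display in editors


def is_combining_mark(ch):
    # General categories Mn, Mc and Me are the combining marks.
    return unicodedata.category(ch) in ("Mn", "Mc", "Me")


def can_be_base(ch):
    # Assumption: only graphic, non-mark characters can carry a mark.
    # Controls such as newline (Cc), format characters (Cf) and
    # line/paragraph separators (Zl, Zp) cannot.
    cat = unicodedata.category(ch)
    return cat[0] in "LNPS" or cat == "Zs"


def find_defective_marks(text):
    """Return the index of the first mark of each defective sequence."""
    positions = []
    attached = False  # True while following marks have something to attach to
    for i, ch in enumerate(text):
        if is_combining_mark(ch):
            if not attached:
                positions.append(i)
                attached = True  # the rest of this run is the same sequence
        else:
            attached = can_be_base(ch)
    return positions


def show_defective(text):
    """Insert a dotted circle before each defective sequence, for display."""
    starts = set(find_defective_marks(text))
    return "".join(
        DOTTED_CIRCLE + ch if i in starts else ch
        for i, ch in enumerate(text)
    )
```

With this sketch, find_defective_marks("a\n\u0301b") reports index 2 (the
acute accent right after the newline), while "e\u0301" reports nothing.
Dropping the sequence or using an invisible glyph instead of the dotted
circle would be equally permitted, which is exactly the
interoperability problem.
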
For me the term "difficult" is inappropriate. In fact such a sequence is
invalid for interoperability (even though it is valid, i.e. not
forbidden, in ISO 10646/Unicode as a string fragment for intermediate
processing), and it should not occur in actual documents outside an
external processing context that defines its behavior.