Re: Hyphenation Markup

From: Richard Wordingham via Unicode <>
Date: Sat, 2 Jun 2018 12:37:45 +0100

On Sat, 2 Jun 2018 11:06:43 +0200
Otto Stolz via Unicode <> wrote:

> Am 2018-06-02 um 06:44 schrieb Richard Wordingham via Unicode:
> > In Latin text, one can indicate permissible line break opportunities
> > between grapheme clusters by inserting U+00AD SOFT HYPHEN. What
> > low-end schemes, if any, exist for such mark-up within grapheme
> > clusters?
> What about U+200B ZWSP?

> > this character is intended for invisible word
> > separation and for line break control; it has no
> > width, but its presence between two characters
> > does not prevent increased letter spacing in
> > justification

Thanks for the suggestion, but it's not likely to work:

Within a word and with a proper layout implementation, using ZWSP
would be worse than using backing store <character-1, SHY,

1) In the sequence

<letter-0, character-1, ZWSP, character-2, letter-1>

realisation of the break should definitely result in <letter-0,
character-1> on one line and in <character-2, letter-1> on the next
line, whereas in visual order, character-2 should precede character-1.

2) Use of ZWSP will usually result in a dotted circle even when the
break does not occur.

3) ZWSP will result in a mandatory word boundary. That will cause
problems with the spell checker.
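
To illustrate point 3, here is a rough Python sketch of a deliberately
simplified tokeniser. It is not an implementation of UAX #29 word
segmentation; it only assumes, as described above, that SHY is ignored
inside a word while ZWSP ends the word. The function name
words_for_spellcheck is hypothetical.

```python
SHY = "\u00AD"   # SOFT HYPHEN
ZWSP = "\u200B"  # ZERO WIDTH SPACE


def words_for_spellcheck(text):
    """Split text into the words a spell checker would be handed.

    Simplifying assumptions (not a full UAX #29 implementation):
    SHY is dropped, so the word stays whole; ZWSP is treated as a
    word separator, just like ordinary white space.
    """
    without_shy = text.replace(SHY, "")
    return without_shy.replace(ZWSP, " ").split()


shy_words = words_for_spellcheck("hy" + SHY + "phen")
zwsp_words = words_for_spellcheck("hy" + ZWSP + "phen")
```

Under these assumptions the SHY version reaches the spell checker as one
word, while the ZWSP version arrives as two fragments, neither of which
is likely to be in the dictionary.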

I've experimented with the
combination <letter, right matra> where there is a default grapheme
cluster boundary between the two characters. I get generally better
results with SHY than ZWSP. The downside was that the rendering
systems I tried seemed to insist on inserting the glyph of U+002D or
U+2010, rather than the glyph of U+00AD.
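
A minimal Python sketch of how such test strings might be put together
follows. DEVANAGARI LETTER KA and DEVANAGARI VOWEL SIGN AA are used
purely as a stand-in <letter, right matra> pair; that choice, and the
helper names with_break_mark and describe, are assumptions made for
illustration and not the data actually used in the experiment.

```python
import unicodedata

SHY = "\u00AD"     # SOFT HYPHEN
ZWSP = "\u200B"    # ZERO WIDTH SPACE
LETTER = "\u0915"  # DEVANAGARI LETTER KA (illustrative choice)
MATRA = "\u093E"   # DEVANAGARI VOWEL SIGN AA (illustrative choice)


def with_break_mark(letter, matra, mark):
    """Return <letter, mark, matra> as it would sit in the backing store."""
    return letter + mark + matra


def describe(text):
    """List each code point with its General_Category, for inspection."""
    return [(f"U+{ord(ch):04X}", unicodedata.category(ch)) for ch in text]


shy_sample = with_break_mark(LETTER, MATRA, SHY)
zwsp_sample = with_break_mark(LETTER, MATRA, ZWSP)
```

Feeding shy_sample and zwsp_sample to the same renderer, at a width that
forces the break and at one that does not, gives the four cases to
compare.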

Incidentally, does CLDR define the rendering of soft hyphen, or is one
entirely at the mercy of the application?

Received on Sat Jun 02 2018 - 06:38:12 CDT
