L2/19-041

 

Linebreak property value of U+00AD SOFT HYPHEN

Eric Muller, Amazon

January 10, 2019

 

 

The character U+00AD SOFT HYPHEN has the linebreak property value BA. In typical use between letters, this creates a linebreak opportunity between it and the following letter.

However, this opportunity needs special treatment if it is to be exercised (e.g. insertion of a hyphen glyph). In fact, it is not a linebreak opportunity but a hyphenation opportunity.

Many layout engines that determine their linebreak opportunities using a stock implementation of UAX #14 either tailor it or post-process the result to ignore this linebreak opportunity. I have seen at least three independent implementations doing this, and I believe that Chromium does so too. So the current value of the linebreak property creates needless complications.
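The post-processing approach can be sketched as follows. This is a minimal illustration and is not taken from any of the implementations mentioned. It assumes a stock UAX #14 implementation has already produced a list of break offsets, and the function name split_opportunities is hypothetical.

```python
SOFT_HYPHEN = "\u00ad"

def split_opportunities(text, breaks):
    """Separate break offsets that directly follow U+00AD from the rest.

    `breaks` is assumed to hold offsets into `text` as reported by a stock
    UAX #14 implementation (hypothetical input; a break at offset i means
    a line may end just before text[i]).
    """
    line_breaks = []
    hyphenation_points = []
    for offset in breaks:
        if 0 < offset <= len(text) and text[offset - 1] == SOFT_HYPHEN:
            # Break caused by SOFT HYPHEN: needs a hyphen glyph if taken.
            hyphenation_points.append(offset)
        else:
            line_breaks.append(offset)
    return line_breaks, hyphenation_points
```

For the text "hy", U+00AD, "phen ation" with break offsets [3, 8, 13], the offset 3 would be moved out of the ordinary linebreak list and handled by the hyphenation logic instead.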

The best way to handle SOFT HYPHEN for linebreak purposes is to “fuse” it with a preceding letter, or to treat it as a letter if there is nothing before it. This is just what LB9 and LB10 do for characters of class CM or ZWJ. Therefore, the proposal is to change the linebreak property value of U+00AD SOFT HYPHEN from BA to ZWJ.
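The effect of LB9 and LB10 on a sequence of linebreak classes can be sketched as follows. This is a simplified illustration under stated assumptions: it models only these two rules and none of the other UAX #14 rules, the input is a list of class names already looked up for each character, and the function name resolve_combining is hypothetical.

```python
# Classes that CM and ZWJ do not attach to under LB9.
NO_ATTACH = {"BK", "CR", "LF", "NL", "SP", "ZW"}

def resolve_combining(classes):
    """Return the effective class at each position after LB9 and LB10.

    LB9: a CM or ZWJ following a base X takes the class of X.
    LB10: any remaining CM or ZWJ is treated as AL.
    """
    resolved = []
    base = None  # class that a following CM/ZWJ would attach to, if any
    for cls in classes:
        if cls in ("CM", "ZWJ"):
            if base is None:
                base = "AL"  # LB10: nothing to attach to, act as a letter
            resolved.append(base)
        else:
            resolved.append(cls)
            base = None if cls in NO_ATTACH else cls
    return resolved
```

With SOFT HYPHEN classified as ZWJ, the sequence AL AL ZWJ AL AL (for example "co", U+00AD, "op") resolves to five AL, so the later rules would see an unbroken run of letters. A SOFT HYPHEN at the start of the text or after a space resolves to AL by LB10.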