Re: proposal for new character 'soft/preferred line break'

From: Jukka K. Korpela
Date: Mon, 10 Feb 2014 09:53:37 +0200

2014-02-10 9:13, Philippe Verdy wrote:

> The <wbr> is enough for this purpose,

No, since the purpose was clearly to specify a line break point that is
preferred over other possible line break points, or even the only
allowed line break point within a string.

The <wbr> tag (an old nonstandard tag, now being standardized in HTML5)
would not have been needed if browsers had supported U+200B. It is
nowadays debatable which one should be used (U+200B has the disadvantage
of not being supported by IE 6, a still somewhat significant point). But
in any case, they are for allowing direct line break points, nothing more.
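
As a rough illustration of what "allowing a break point" means with
U+200B, here is a minimal Python sketch. The helper name
allow_breaks_after and the choice of "/" as the character to break after
are assumptions made for the example only. It marks break opportunities
and says nothing about which of them a renderer should prefer.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

def allow_breaks_after(text, chars="/"):
    """Return text with U+200B inserted after each character found in chars."""
    out = []
    for ch in text:
        out.append(ch)
        if ch in chars:
            out.append(ZWSP)  # a break is now allowed here, not required
    return "".join(out)

original = "usr/local/share/unicode"
marked = allow_breaks_after(original)
```

Removing every U+200B from the result gives back the original string, so
the visible text is unchanged.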

> A browser could even use them to give higher priority to break lines,

That would be rather arbitrary and won’t happen; there is no good reason
for that.

> What you want is just to hint the line breaker in the renderer on where
> the linebreaks are the most beneficial. This is really something that
> does not belong to plain text, but to the presentation layer, and HTML
> for example is rich enough about such presentation layer

In rendering software, the choice between line break opportunities is
usually either a very simple one (put as many characters on a line as
possible) or a complicated layout decision that tries to optimize the
spacing between words at a paragraph level. I don’t think there is much
room for any layout instructions at any layer, beyond interactive fine
tuning where a human user instructs the program to split at a specific
point and sees what happens, or prevents a specific break.
Theoretically, it is an interesting idea to consider control characters
or markup for line break opportunities with different preferability, but
in practice, it would be too complicated as compared with the possible gain.
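
To make the two strategies concrete, here is a minimal Python sketch.
Both functions are illustrations under simplifying assumptions, not what
any particular browser or typesetter does: widths are counted in
characters, breaks are allowed only between words, every word fits on a
line, and the "paragraph level" cost is taken to be the sum of squared
unused columns per line. The names greedy_lines and optimal_lines are
made up for the example.

```python
def greedy_lines(words, width):
    """Put as many words on each line as fit within width columns."""
    lines, current = [], ""
    for word in words:
        candidate = word if not current else current + " " + word
        if len(candidate) <= width or not current:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines

def optimal_lines(words, width):
    """Choose breaks minimizing the sum of squared slack over all lines.

    Assumes every single word is at most width characters long.
    """
    n = len(words)
    best = [0] + [float("inf")] * n  # best[i]: minimal cost for words[:i]
    prev = [0] * (n + 1)             # prev[i]: start index of the last line
    for end in range(1, n + 1):
        length = -1
        for start in range(end - 1, -1, -1):
            length += len(words[start]) + 1
            if length > width:
                break
            cost = best[start] + (width - length) ** 2
            if cost < best[end]:
                best[end], prev[end] = cost, start
    lines, end = [], n
    while end > 0:
        lines.append(" ".join(words[prev[end]:end]))
        end = prev[end]
    return lines[::-1]

words = "aaa bb cc ddddd".split()
greedy = greedy_lines(words, 6)    # ['aaa bb', 'cc', 'ddddd']
optimal = optimal_lines(words, 6)  # ['aaa', 'bb cc', 'ddddd']
```

Neither function has an obvious place for a "preferred" break point: the
first never compares alternatives, and the second would need a weight for
every opportunity.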

> In my opinion the encoded SHY character is there only for legacy reasons
> (compatibility with older encodings when renderers had no good option to
> break words). But in HTML SHY is not needed and <wbr> will work better.

They are completely different things. You might be confusing <wbr> with
&shy; (which is just a named reference for SHY, useful when you want it
to be visible in source code).
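
A quick way to check the distinction, sketched in Python with only the
standard library: &shy; decodes to U+00AD SOFT HYPHEN, which is a
different character from U+200B ZERO WIDTH SPACE. A break at SHY is shown
with a hyphen; a break at U+200B or <wbr> is not.

```python
import html
import unicodedata

shy = html.unescape("&shy;")  # the named reference decodes to U+00AD
zwsp = "\u200b"               # U+200B, the character counterpart of <wbr>

names = [unicodedata.name(shy), unicodedata.name(zwsp)]
```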


Unicode mailing list
Received on Mon Feb 10 2014 - 01:54:48 CST
