Re: Soft Hyphens in Complex and East Asian Scripts

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Wed, 30 Apr 2014 21:43:03 +0100

On Wed, 30 Apr 2014 17:46:57 +0000
Koji Ishii <kojiishi_at_gluesoft.co.jp> wrote:

> Korean has U+00AD encoded in their legacy encoding, so they may have
> typographic rules for it, but I’m not very familiar with Korean. As
> far as I searched for KLREQ[1], I could not get a hit.

> [1] http://www.w3.org/TR/klreq/

Thanks for the link. Reading it leaves me uncertain as to whether one
should expect to encounter U+00AD within a Korean word, but part of
the issue may be how it is to be rendered.

I found some very relevant reading in the Cascading Style Sheets
literature. http://www.w3.org/TR/css3-text/#hyphenate appears to
reveal the existence of soft hyphens in Arabic text. Santhosh
Thottingal has been doing some well-received work on hyphenation in
Indian scripts (see e.g.
http://thottingal.in/blog/2013/03/17/hyphenation-in-web ), and the only
criticism I could see was in the rendering of the active soft hyphens.
There is a suggested solution at
http://dev.w3.org/csswg/css-text-4/#hyphenate-character , though I'm
not sure that there will always be a character with the right glyph.

On the basis of this information, I'm happy to contend that U+00AD can
be found in words in many non-'Western' scripts. I can't even be
beaten by a claim that ZWSP is the character for an invisible soft
hyphen.
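
For anyone wanting to check a text for such occurrences, something
like the following Python sketch might do. The helper names are
hypothetical, and the sample is just a Malayalam word with U+00AD
inserted at an arbitrary position; it is not a claim about where that
word ought to be hyphenated.

```python
import unicodedata

SHY = "\u00ad"   # SOFT HYPHEN
ZWSP = "\u200b"  # ZERO WIDTH SPACE


def soft_hyphen_positions(text):
    """Return the code point indices at which U+00AD occurs in text."""
    return [i for i, ch in enumerate(text) if ch == SHY]


def strip_soft_hyphens(text):
    """Return text with every U+00AD removed, e.g. before searching."""
    return text.replace(SHY, "")


# Illustrative sample only: six Malayalam code points with a soft
# hyphen placed after the second one (position chosen arbitrarily).
sample = "\u0d2e\u0d32" + SHY + "\u0d2f\u0d3e\u0d33\u0d02"

if __name__ == "__main__":
    print(soft_hyphen_positions(sample))
    # Both characters are format characters (Cf), but they are
    # distinct code points with distinct names.
    for ch in (SHY, ZWSP):
        print("U+%04X" % ord(ch), unicodedata.name(ch),
              unicodedata.category(ch))
```

Whether and how an active soft hyphen is drawn is left to the
renderer; the sketch only locates the character in the backing store.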

Richard.
_______________________________________________
Unicode mailing list
Unicode_at_unicode.org
http://unicode.org/mailman/listinfo/unicode
Received on Wed Apr 30 2014 - 15:44:20 CDT