Soft Hyphens in Complex and East Asian Scripts

From: Richard Wordingham
Date: Sun, 27 Apr 2014 23:46:09 +0100

I'm trying to assess the impact of what I regard as a word-processing
bug, and this forum seems to be the best source of information.

What writing systems using 'complex' or 'East Asian' scripts use U+00AD
SOFT HYPHEN in a manner that is potentially visually distinct from

The only good example I have is Thai, and it seems remiss that most of
the 8-bit encodings for Thai don't support invisible line-breaking
opportunities at all.

I do have two probable examples from a book in Tai Khuen (Tai Tham
script) published in Thailand, but they may result from poor editing
or, possibly, be plain hyphens. Both words appear to be proper nouns.
The book has several examples of clear words broken across lines without
any hyphenation.

Are there any 'complex' or 'East Asian' scripts where U+00AD and U+200B
have the same visual effect but are used for different semantics? An
obvious example would be for U+200B to mark word boundaries but for
U+00AD to mark line break opportunities within a word.
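To make that distinction concrete, here is a minimal sketch, assuming a convention in which U+200B separates words and U+00AD marks break opportunities inside a word. The function name split_breaks and the return shape are hypothetical, chosen only for illustration; no existing word processor or library is claimed to work this way.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE: assumed here to mark word boundaries
SHY = "\u00ad"   # U+00AD SOFT HYPHEN: assumed here to mark in-word break opportunities


def split_breaks(text):
    """Return a list of (word, offsets) pairs.

    Each word has its soft hyphens removed; offsets are the positions
    inside that word at which a soft hyphen allowed a line break.
    """
    result = []
    for raw in text.split(ZWSP):
        if not raw:
            continue  # skip empty pieces from leading, trailing or doubled U+200B
        parts = raw.split(SHY)
        offsets = []
        pos = 0
        for part in parts[:-1]:
            pos += len(part)
            offsets.append(pos)  # a break is allowed after this many characters
        result.append(("".join(parts), offsets))
    return result


if __name__ == "__main__":
    sample = "ab" + SHY + "cd" + ZWSP + "ef"
    print(split_breaks(sample))
```

Under this convention a renderer could break a line at either character, but only a break taken at U+00AD would be a candidate for a visible hyphen.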

Unicode mailing list
Received on Mon Apr 28 2014 - 23:52:42 CDT
