Mark-up to Indicate Words

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Wed, 15 Jul 2015 08:49:13 +0100

What mark-up schemes exist to show that a sequence of letters and
combining marks constitutes a single word?

Such mark-up would be useful when using spell checkers. At present, I
use U+2060 WORD JOINER (WJ) to indicate the absence of a word boundary.
(Systematic marking of boundaries using U+200B ZERO WIDTH SPACE (ZWSP)
is not popular with users, and is normally not used in Thai - it's not
supported in the Thai national or Windows 8-bit encodings.) However,
it seems likely that when Unicode 8.0 is defined in August, WJ will
suppress line breaks but not word breaks. There would still be the
limitation that
mark-up is not available in plain text.
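
A minimal sketch, in Python, of the WJ convention described above. The
helper names (join_as_word, split_words, strip_joiners) are hypothetical,
and the splitter is a toy: it assumes that only whitespace and ZWSP mark
word boundaries, and it is not an implementation of the Unicode word
boundary rules.

```python
WJ = "\u2060"    # U+2060 WORD JOINER
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def join_as_word(parts):
    """Glue the pieces of one word together with WJ between them."""
    return WJ.join(parts)


def split_words(text):
    """Toy segmenter: break at whitespace and at ZWSP, never at WJ.

    ZWSP is a format character, not whitespace, so str.split() would
    ignore it; it is mapped to an ordinary space first.
    """
    return text.replace(ZWSP, " ").split()


def strip_joiners(word):
    """Remove WJ so the word can be looked up in a plain dictionary."""
    return word.replace(WJ, "")


if __name__ == "__main__":
    marked = join_as_word(["กรุง", "เทพ"])
    for word in split_words(marked + ZWSP + "ไทย"):
        print(repr(word), "->", repr(strip_joiners(word)))
```

A spell checker following this convention would treat each item returned
by split_words as one word and pass the result of strip_joiners to its
dictionary lookup.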

It appears that, for example, Open Document Format has no mark-up to
indicate word boundaries, relying instead on overrides of the word
boundary detection algorithm that are stored at character level.

Richard.
Received on Wed Jul 15 2015 - 02:50:40 CDT