When are grapheme clusters "meaningless"?
Author:  guest [ Thu Dec 23, 2010 7:59 pm ]
Another issue raised on the list:
The "grapheme cluster boundary" algorithm seems to quietly allow building meaningless "graphemes" such as base-less (sequences of) combining codes. What are we expected to do with them?

Author:  RichardWordingham [ Sun Feb 17, 2013 10:05 am ]
When it comes to displaying them, there are two main options if they consist entirely of non-spacing marks. The first is to display them on <U+00A0 NO-BREAK SPACE>. The second is to give an error indication, e.g. by displaying them on <U+25CC DOTTED CIRCLE>, possibly breaking up the sequence.
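To make the two display options concrete, here is a rough Python sketch. It assumes the text has already been split into clusters, and it treats "non-spacing mark" as general category Mn only. The names `is_all_nonspacing` and `displayable` are made up for illustration, not taken from any library.

```python
import unicodedata

NBSP = "\u00A0"           # U+00A0 NO-BREAK SPACE
DOTTED_CIRCLE = "\u25CC"  # U+25CC DOTTED CIRCLE


def is_all_nonspacing(cluster: str) -> bool:
    """True if the cluster is non-empty and every code point has category Mn."""
    return bool(cluster) and all(
        unicodedata.category(ch) == "Mn" for ch in cluster
    )


def displayable(cluster: str, flag_error: bool = False) -> str:
    """Give a baseless cluster something to sit on before rendering.

    flag_error=False: option one, put the marks on NO-BREAK SPACE.
    flag_error=True:  option two, put them on DOTTED CIRCLE as an error cue.
    Clusters that already have a base are returned unchanged.
    """
    if is_all_nonspacing(cluster):
        return (DOTTED_CIRCLE if flag_error else NBSP) + cluster
    return cluster
```

For example, `displayable("\u0301")` puts the lone acute accent on a no-break space, while `displayable("e\u0301")` leaves the cluster alone. The sketch does not attempt the "possibly breaking up the sequence" variant of the second option.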

There are many options for rendering a spacing mark plus non-spacing marks. There are occasions when non-spacing marks are intended to be treated as though they were letters.

If a search string starts with a baseless cluster, I would say that is a very good argument for ignoring any 'complete graphemes only' setting when looking for the starting boundary of the matching string.
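A rough sketch of that search behaviour in Python follows. It is only an illustration: the boundary test is a deliberate simplification (a boundary is the start or end of the text, or any position not followed by a code point of category M*), not the full grapheme cluster boundary algorithm, and the names `find_matches` and `whole_clusters` are hypothetical.

```python
import unicodedata


def is_mark(ch: str) -> bool:
    """True for combining marks (general categories Mn, Mc, Me)."""
    return unicodedata.category(ch).startswith("M")


def is_boundary(text: str, i: int) -> bool:
    """Simplified cluster boundary: text start/end, or just before a non-mark."""
    return i == 0 or i == len(text) or not is_mark(text[i])


def find_matches(text: str, needle: str, whole_clusters: bool = True) -> list[int]:
    """Return start offsets of needle in text.

    With whole_clusters=True a match must start and end on a cluster
    boundary, except that the start check is skipped when the needle
    itself begins with a combining mark (a baseless cluster).
    """
    if not needle:
        return []
    relax_start = is_mark(needle[0])
    hits = []
    start = text.find(needle)
    while start != -1:
        end = start + len(needle)
        start_ok = relax_start or not whole_clusters or is_boundary(text, start)
        end_ok = not whole_clusters or is_boundary(text, end)
        if start_ok and end_ok:
            hits.append(start)
        start = text.find(needle, start + 1)
    return hits
```

With the text `"cafe\u0301 e"`, searching for `"\u0301"` still finds the accent at offset 4 even with `whole_clusters=True`, whereas searching for `"e"` only matches the final, unaccented letter.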

