The Unicode Consortium Discussion Forum (CLOSED)

When are grapheme clusters "meaningless"
Page 1 of 1

Author:  guest [ Thu Dec 23, 2010 7:59 pm ]
Post subject:  When are grapheme clusters "meaningless"

Another issue raised on the list:
The "grapheme cluster boundary" algorithm seems to quietly allow building meaningless "graphemes", such as base-less combining codes or sequences of them. What are we expected to do with them?

Author:  RichardWordingham [ Sun Feb 17, 2013 10:05 am ]
Post subject:  Re: When are grapheme clusters "meaningless"

When it comes to displaying them, there are two main options if they consist entirely of non-spacing marks. The first is to display them on <U+00A0 NO-BREAK SPACE>. The second is to give an error indication, e.g. by displaying them on <U+25CC DOTTED CIRCLE>, possibly breaking up the sequence.
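As a rough illustration of those two display options, here is a minimal Python sketch. The function name `display_form` and the `on_error` flag are hypothetical, and it assumes "non-spacing mark" means general category Mn or Me; a real renderer would make this decision in the shaping engine rather than by editing the string.

```python
import unicodedata

NBSP = "\u00A0"           # U+00A0 NO-BREAK SPACE
DOTTED_CIRCLE = "\u25CC"  # U+25CC DOTTED CIRCLE

def is_nonspacing_mark(ch):
    # Assumption: treat general categories Mn and Me as non-spacing marks.
    return unicodedata.category(ch) in ("Mn", "Me")

def display_form(cluster, on_error=False):
    """Return a displayable form of one grapheme cluster (hypothetical helper).

    A cluster made up entirely of non-spacing marks gets a visible carrier:
    NO-BREAK SPACE by default, or DOTTED CIRCLE as an error indication.
    Any other cluster is returned unchanged.
    """
    if cluster and all(is_nonspacing_mark(ch) for ch in cluster):
        carrier = DOTTED_CIRCLE if on_error else NBSP
        return carrier + cluster
    return cluster
```

This sketch keeps the mark sequence together on one carrier; breaking the sequence up, as mentioned above, would mean inserting a carrier before each mark instead.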

There are many options for rendering a spacing mark plus non-spacing marks. There are also occasions when non-spacing marks are intended to be treated as though they were letters.

If a search string starts with a baseless cluster, I would say that was a very good argument for ignoring any 'complete graphemes only' setting when looking for the starting boundary of the matching string.
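To make that concrete, here is a small Python sketch of a search that enforces a 'complete graphemes only' setting but skips the check at the start of the match when the search string itself begins with a combining mark. The names `find` and `whole_clusters` are hypothetical, and the boundary test is a deliberate simplification (a boundary is assumed before any character that is not a mark), not the full grapheme cluster boundary algorithm.

```python
import unicodedata

def is_mark(ch):
    # General categories Mn, Mc and Me all start with "M".
    return unicodedata.category(ch).startswith("M")

def is_boundary(text, i):
    # Simplified assumption: a cluster boundary exists at either end of the
    # text and before any character that is not a combining mark.
    return i == 0 or i == len(text) or not is_mark(text[i])

def find(text, needle, whole_clusters=True):
    """Return the index of the first acceptable match, or -1 (hypothetical)."""
    if not needle:
        return -1
    # A needle that starts with a mark can never begin on a cluster boundary
    # inside ordinary text, so the start check is ignored for it.
    check_start = whole_clusters and not is_mark(needle[0])
    pos = text.find(needle)
    while pos != -1:
        end = pos + len(needle)
        start_ok = not check_start or is_boundary(text, pos)
        end_ok = not whole_clusters or is_boundary(text, end)
        if start_ok and end_ok:
            return pos
        pos = text.find(needle, pos + 1)
    return -1
```

With the text "cafe\u0301 x", searching for "\u0301" alone still matches inside the cluster, while searching for a bare "e" is rejected because the match would end in the middle of a cluster.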

All times are UTC - 6 hours [ DST ]