Do `Grapheme_Extend` characters only apply to `Grapheme_Base`?

From: Mathias Bynens <>
Date: Wed, 23 Apr 2014 22:16:26 +0200

Let’s say I’m writing a program that strips combining characters and grapheme extenders from an input string.

For combining marks, I’m looking for any non-combining mark (e.g. `a`) followed by one or more combining marks (e.g. U+0303 COMBINING TILDE, `̃`), and then removing everything but the non-combining mark (leaving only `a`). Is this a correct approach?
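
To make the combining-mark approach concrete, here is a minimal sketch in Python. It assumes that “combining mark” means any code point whose General_Category is Mn, Mc, or Me, as reported by the standard `unicodedata` module. The names `is_combining_mark` and `strip_combining_marks` are made up for illustration.

```python
import unicodedata


def is_combining_mark(ch: str) -> bool:
    # Assumption: "combining mark" = General_Category Mn, Mc, or Me.
    return unicodedata.category(ch).startswith("M")


def strip_combining_marks(text: str) -> str:
    out = []
    seen_base = False  # becomes True once a non-combining mark has been seen
    for ch in text:
        if is_combining_mark(ch):
            if seen_base:
                # Mark follows a non-combining mark (possibly via other marks): drop it.
                continue
            # Mark at the very start with nothing before it: left untouched.
            out.append(ch)
        else:
            seen_base = True
            out.append(ch)
    return "".join(out)
```

Under these assumptions, `strip_combining_marks("a\u0303")` yields `"a"`. A precomposed character such as U+00E3 is not a combining sequence, so it passes through unchanged unless the input is first normalized to NFD.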

What should the approach be for grapheme extenders? Should the program only look for `Grapheme_Base` characters followed by `Grapheme_Extend` characters (which includes the code points in `Other_Grapheme_Extend`)?
Unicode mailing list
Received on Wed Apr 23 2014 - 15:17:45 CDT
