Re: Are Named sequences always going to be graphemes?

From: Ken Whistler <kenw_at_sybase.com>
Date: Wed, 20 Jun 2012 18:43:10 -0700

On 6/20/2012 3:22 PM, Karl Williamson wrote:
> All current named sequences appear to be each a single grapheme. That
> seems like it should always be the case.

Possibly, but keep in mind that neither the Unicode Standard nor UAX #29
in particular defines what a "grapheme" is. UAX #29 specifies an algorithm
for determining boundaries between "grapheme clusters", but it can be
tailored, and as a result what the "thing" is between such boundaries is
a little fuzzy. And even the default for that algorithm can and does change.
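
To illustrate how the unit between boundaries shifts with the rules chosen,
here is a minimal sketch. It is not the UAX #29 algorithm: it only glues
characters of selected General_Category values onto the preceding character,
and it ignores Hangul, ZWJ, regional indicators and everything else the real
rules cover. The names clusters and attach are hypothetical, chosen for this
example only.

```python
import unicodedata

def clusters(text, attach=("Mn", "Me")):
    """Rough stand-in for cluster segmentation (an assumption, not UAX #29).

    A character whose General_Category is listed in `attach` is glued to
    the preceding character; every other character starts a new unit.
    """
    out = []
    for ch in text:
        if out and unicodedata.category(ch) in attach:
            out[-1] += ch      # extend the current unit
        else:
            out.append(ch)     # start a new unit
    return out

# A base letter plus a nonspacing combining mark (U+0100 U+0300).
seq = "\u0100\u0300"

# DEVANAGARI LETTER KA + DEVANAGARI VOWEL SIGN AA (a spacing mark, Mc).
ka_aa = "\u0915\u093E"

narrow = clusters(ka_aa)                             # Mc not attached
wide = clusters(ka_aa, attach=("Mn", "Me", "Mc"))    # Mc attached

print(len(clusters(seq)), len(narrow), len(wide))
```

With the narrower rule set the Devanagari pair splits into two units, and
with spacing marks attached it stays as one. The same text yields a different
"thing" depending on which rules are in force.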

Furthermore, I don't see any necessary correlation between what sequences
people might end up insisting on naming (for whatever reason) and what
people might consider to be "graphemes". There could be a valid reason
somebody might want or need to name some sequence that clearly wouldn't
constitute a grapheme. Who can predict?

> If I'm right, should UAX #34 say this?

That seems like a straitjacket looking for an unwilling wearer. ;-)

--Ken
Received on Wed Jun 20 2012 - 20:47:42 CDT
