Mark-Driven Script Categorisation (was: Compliant Tailoring of Normalisation for the Unicode Collation Algorithm)

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Thu, 17 May 2012 21:23:00 +0100

On Wed, 16 May 2012 21:46:17 -0700
Mark Davis ☕ <mark_at_macchiato.com> wrote:

> No, it's not.
>
> Including x in Lao for some pedagogical (I'm guessing) purpose is
> completely out of scope. That'd be like including π in Latin because
> it sometimes occurs in the middle of English text.

No, it's more like including Devanagari candrabindu in the Latin
script because it sometimes occurs on Latin letters in discussions of
Sanskrit. (Actually, I can only recall it on lower-case 'l'.) We
already have U+0310 COMBINING CANDRABINDU.

The problem is that 'x' then takes the full set of Lao vowel symbols,
forming a default grapheme cluster.
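
A rough way to see the clustering, as a sketch only: the Python below groups each character with any nonspacing or enclosing marks that follow it. The function name rough_clusters is made up for illustration, and treating general categories Mn and Me as a stand-in for Grapheme_Extend is an assumption; the actual UAX #29 rules cover more cases (Hangul, ZWJ sequences, spacing marks and so on).

```python
import unicodedata

def rough_clusters(text):
    """Group each character with the nonspacing/enclosing marks after it.

    Assumption: categories Mn and Me approximate Grapheme_Extend.
    This is not a full UAX #29 segmenter.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch) in ("Mn", "Me"):
            # A mark attaches to the preceding cluster, whatever its script.
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

# Latin 'x' followed by U+0EB4 LAO VOWEL SIGN I, then a plain 'y'.
sample = "x\u0EB4y"
for c in sample:
    print(f"U+{ord(c):04X}", unicodedata.category(c), unicodedata.name(c))
print(len(rough_clusters(sample)), "clusters")
```

Under that approximation the Lao vowel sign joins the cluster of the Latin 'x' just as U+0310 would join an 'l'; nothing in the grouping looks at the script of the base.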

Richard.
Received on Thu May 17 2012 - 15:26:16 CDT
