Re: "A Programmer's Introduction to Unicode"

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Sun, 12 Mar 2017 20:10:22 +0000

On Sun, 12 Mar 2017 20:02:28 +0100
"Janusz S. Bien" <jsbien_at_mimuw.edu.pl> wrote:

> If the basic notion has to be referred in a cumbersome way as
> "extended grapheme cluster" then it is easier to talk about "Unicode
> characters" despite the fact that they have a rather loose relation
> to real-life/user-perceived characters.

The notion that extended grapheme clusters correspond to
user-perceived characters is also rather dodgy. Whereas it may work
for French, it is getting very dubious by the time one adds Hebrew
cantillation marks or Vedic accentuation. The Thais revolted when
their preposed vowels were joined with the following consonant in the
same extended grapheme cluster, and Unicode had to revoke that union.

Richard.
Received on Sun Mar 12 2017 - 15:11:02 CDT
