Re: Character set cluelessness

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Tue, 2 Oct 2012 23:09:38 +0100

On Tue, 02 Oct 2012 09:14:08 -0700
"Doug Ewell" <doug_at_ewellic.org> wrote:

> It's 2012. How does one get through to folks like this?

Even people who should know better can get confused about character
sets. Does anyone know what 'a complex script Unicode range' is? It's
a term that occurs in the Office Open XML specification, but I
can't find a definition for it.

It's just possible that it means a range in which even unassigned
characters would, hypothetically, not be left-to-right, but I've a
feeling it ought to include Vietnamese characters even though they're
Latin script.

Possibly no definition has been provided because the concept ought to
involve the tricky task of breaking text runs into script runs. (Lots
of people feel one should be able to add script-specific combining
marks to U+25CC DOTTED CIRCLE, U+2013 EN DASH and U+00D7
MULTIPLICATION SIGN, or perhaps even U+0078 LATIN SMALL LETTER X.
U+0964 DEVANAGARI DANDA is used with the Latin, Devanagari and Tamil
scripts, to name but a few.)
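
To illustrate why the split is tricky, here is a minimal sketch in
Python of one naive way to break text into script runs. It is not
taken from any specification. The SCRIPT_RANGES table is a
hypothetical, heavily abbreviated stand-in for the Script property in
the Unicode Character Database, and the function names script_of and
script_runs are made up for this example. The rule assumed here is
that Common and Inherited characters simply join whatever run they
fall in, which is the behaviour that lets a danda or a dotted circle
travel with a neighbouring script.

```python
# Hypothetical, abbreviated script table (assumption for illustration).
# A real implementation would load the Script property from the
# Unicode Character Database rather than hard-code ranges.
SCRIPT_RANGES = [
    (0x0041, 0x005A, "Latin"),
    (0x0061, 0x007A, "Latin"),
    (0x00C0, 0x00D6, "Latin"),
    (0x00D8, 0x00F6, "Latin"),       # U+00D7 MULTIPLICATION SIGN is skipped
    (0x00F8, 0x024F, "Latin"),
    (0x0300, 0x036F, "Inherited"),   # simplified: treats the whole block alike
    (0x0900, 0x0950, "Devanagari"),
    (0x0951, 0x0954, "Inherited"),
    (0x0955, 0x0963, "Devanagari"),
    (0x0964, 0x0965, "Common"),      # danda and double danda
    (0x0966, 0x097F, "Devanagari"),
    (0x0B80, 0x0BFF, "Tamil"),       # simplified: ignores unassigned gaps
]

NEUTRAL = ("Common", "Inherited")


def script_of(ch):
    """Look up the script of one character in the sketch table."""
    cp = ord(ch)
    for lo, hi, name in SCRIPT_RANGES:
        if lo <= cp <= hi:
            return name
    # Assumption: anything not listed is treated as Common in this sketch.
    return "Common"


def script_runs(text):
    """Split text into (script, substring) runs.

    Common and Inherited characters never start a new run; they stay
    in the open run. Leading neutrals take the script of the first
    non-neutral character that follows them.
    """
    runs = []
    start = 0
    current = None  # None while only neutral characters have been seen
    for i, ch in enumerate(text):
        s = script_of(ch)
        if s in NEUTRAL:
            continue
        if current is None:
            current = s
        elif s != current:
            runs.append((current, text[start:i]))
            start, current = i, s
    if start < len(text):
        runs.append((current or "Common", text[start:]))
    return runs


# The danda ends up in a Latin run and in a Devanagari run:
latin_then_devanagari = script_runs("ok\u0964 \u0928\u092E\u0964")
# A dotted circle followed by a Devanagari vowel sign forms one run:
dotted_circle = script_runs("\u25CC\u093F")
```

With this table, the first call yields a Latin run "ok" plus danda and
space, followed by a Devanagari run that ends in its own danda; the
second yields a single Devanagari run. Attaching trailing neutrals to
the preceding run is only one possible policy, and a character's script
alone clearly cannot say which range a danda "belongs" to.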

Richard.
Received on Tue Oct 02 2012 - 17:10:51 CDT
