Re: Amerindian Characters

From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Jun 16 1999 - 14:29:59 EDT


Michael Bauer suggested adding a number of additional characters
for Amerindian writing systems (a with cedilla, a with cedilla and acute,
e with cedilla [which *is* in Version 3.0, by the way], and so on).
[Incidentally, the Lakhota (Siouan), Tuscarora (Iroquoian), and
Navajo (Athapaskan) nasal vowels are probably more correctly
represented with an ogonek than with a cedilla.]
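
To make the ogonek/cedilla distinction concrete, here is a minimal
sketch in Python, assuming the standard unicodedata module and a
reasonably current Unicode database. The helper name describe() is
purely illustrative. It shows that a + combining ogonek has a
precomposed equivalent, while a + combining cedilla simply stays a
two-code-point sequence:

```python
import unicodedata

OGONEK = "\u0328"   # COMBINING OGONEK
CEDILLA = "\u0327"  # COMBINING CEDILLA


def describe(text):
    """Return the Unicode character name of each code point in text."""
    return [unicodedata.name(ch) for ch in text]


# a + combining ogonek composes to one precomposed letter under NFC.
a_ogonek = unicodedata.normalize("NFC", "a" + OGONEK)

# a + combining cedilla has no precomposed form, so NFC leaves the
# base letter and the mark as two separate code points.
a_cedilla = unicodedata.normalize("NFC", "a" + CEDILLA)

print(describe(a_ogonek))
print(describe(a_cedilla))
```

Either way the text is representable; the only difference is how
many code points it takes.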

The problem, of course, is that there is no end to this. Many,
many combinations of Latin base letters plus combining diacritic
marks have been used in the orthographies of indigenous languages
of the Americas. And the orthographies are all over the map--literally
and figuratively--ranging from highly technical adaptations of
IPA or Americanist phonetic practice, to practical orthographies
using digraphs and trigraphs more appropriate to typewriters or
other available, but font-poor, rendering technologies. And also
ranging from orthographies that have more or less official status
backed by a nation or tribe to orthographies developed ad hoc
for individual publications by experts.

Just one example of an obvious omission from Michael's list that
is nevertheless in extremely widespread use in the Americas:
letters with comma above to indicate ejective and/or "glottalized"
(glottal coarticulation) consonants. These combinations are
certainly attested in the literature for at least:
b, c, c-hacek, d, j, k, l, m, n, p, q, t, s, s-hacek, w, x, y,
barred-l, barred-lambda, plus for labialized or palatalized versions
of many of the same letters, and occasionally for letters
otherwise marked with another diacritic for modified articulation
(e.g. t-dental, t-alveolar, t-retroflex + comma above, etc.).
And even, for Nootka: middle-dot + comma-above!
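
As a rough illustration of how the list above can be handled with a
single combining mark, the following Python sketch appends
U+0313 COMBINING COMMA ABOVE to a handful of the bases. The
selection of bases, the labels, and the helper name glottalized()
are my own assumptions for the example, not a survey of any
particular orthography:

```python
import unicodedata

COMMA_ABOVE = "\u0313"  # COMBINING COMMA ABOVE

# A few bases from the list; the selection is illustrative only.
BASES = {
    "k": "k",
    "c-hacek": "\u010d",        # LATIN SMALL LETTER C WITH CARON
    "s-hacek": "\u0161",        # LATIN SMALL LETTER S WITH CARON
    "barred-l": "\u0142",       # LATIN SMALL LETTER L WITH STROKE
    "barred-lambda": "\u019b",  # LATIN SMALL LETTER LAMBDA WITH STROKE
    "middle-dot": "\u00b7",     # MIDDLE DOT
}


def glottalized(base):
    """Return base + combining comma above, normalized to NFC."""
    return unicodedata.normalize("NFC", base + COMMA_ABOVE)


glottalized_forms = {label: glottalized(base) for label, base in BASES.items()}

for label, form in glottalized_forms.items():
    codes = " ".join(f"U+{ord(ch):04X}" for ch in form)
    print(f"{label:14} {form}  {codes}")
```

None of these combinations has a precomposed code point, so each
one remains a base character followed by the combining mark, and
the same one-line helper covers every entry in the list.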

Rather than start down the same, sad, political engineering path
pioneered by all the arguments over precomposed characters needed for the
official alphabets of European languages, it is far preferable to
recognize that representation of text for the Americas is best
done by making use of the combining marks that are *already*
present in the standard, rather than fighting for years over which
additional precomposed characters to add to the standard.

If you make use of the combining marks, *all* languages of
the Americas are already representable using the Unicode Standard.
If you refuse to use combining marks, it will be *decades*
before the surveying, collecting, arguing, and balloting is done,
and even then, you still run the risk that commercial
applications in the future may abjure such marginal additions,
leaving support for them in limbo.
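
One practical consequence of relying on combining marks is that the
same letter can arrive in several canonically equivalent spellings,
so software should normalize before comparing. A minimal Python
sketch, assuming the standard unicodedata module; the function name
canonically_equal() and the sample strings (a with ogonek and acute)
are hypothetical choices for illustration:

```python
import unicodedata


def canonically_equal(a, b):
    """Compare two strings after canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


precomposed_base = "\u0105\u0301"  # a-ogonek (precomposed) + combining acute
fully_combining = "a\u0328\u0301"  # a + combining ogonek + combining acute
marks_reordered = "a\u0301\u0328"  # same two marks, entered in the other order

# The raw code point sequences differ, but all three spellings are
# canonically equivalent.
print(precomposed_base == fully_combining)
print(canonically_equal(precomposed_base, fully_combining))
print(canonically_equal(fully_combining, marks_reordered))
```

Swapping in a different mark, such as cedilla for ogonek, is a real
difference in the text and is not erased by normalization.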

--Ken Whistler


