Lakota reprise: (Re)birth of a character

From:
Date:

On a couple of occasions the issue of Unicode coverage of
the Lakota orthography has come up on this list. I finally
tracked down enough source material to identify the problem.

The issue for Lakota in Unicode is the representation of
the Lakota nasal vowels in the 1982 Lakota orthography. That
orthography was developed by Lakota educators, was adopted
by the South Dakota Association of Bilingual and Bicultural
Education, and is being used to print books, dictionaries,
and teaching materials for Lakota.

There are a number of encoding issues for the 1982 Lakota
orthography in Unicode, because of the nature of the diacritic
usage that was chosen. That diacritic usage departs from
Americanist conventions to meet a number of criteria, including
familiarity from older usage, aesthetics, and some other
intangible factors.

In particular, to represent the 1982 Lakota orthography in
Unicode, you must make use of Latin letters plus the following
characters as diacritics:


U+0307 COMBINING DOT ABOVE
       indicates aspiration on surds (p, t, c, k); modified point
       of articulation on fricatives (s, h); modified manner
       of articulation on g [g-dot-above = voiced velar fricative].


       indicates voicelessness on surds (p, t, c, k).


U+02B9 MODIFIER LETTER PRIME
       indicates ejective release on surds (p, t, c, k); post-glottalic
       release on fricatives.

The latter usage is derived from the use in the Buechel 1939
grammar of the (typewriter) apostrophe (i.e. U+0027) for the
same function. And that, in turn, is related to the Americanist
usage of U+02BC MODIFIER LETTER APOSTROPHE to indicate ejective
or glottal release. This means there is probably going to be some
ambiguity in the representation of Lakota, since people are going
to be uncertain as to whether U+02B9, U+02BC, or U+0027 should be
used. The fonts used with the current printed material clearly show
a prime mark, rather than a raised comma or a directionally neutral
apostrophe, but Lakota linguists and educators will presumably need
to decide this one.
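
To make the ambiguity concrete, here is a minimal Python sketch that
looks up the three candidate characters named above in the Unicode
character database shipped with the Python runtime. The helper names
(describe, fold_glottal_mark) are hypothetical, and the default fold
target of U+02B9 is only a placeholder, not a recommendation; the
choice remains with Lakota linguists and educators.

```python
import unicodedata

# The three candidate characters named in the text above.
CANDIDATES = ["\u02B9", "\u02BC", "\u0027"]


def describe(ch):
    """Return (code point label, Unicode name, general category)."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


def fold_glottal_mark(text, target="\u02B9"):
    """Hypothetical helper: map all three candidates to one target.

    Caution: this also rewrites every ordinary U+0027 in the text
    (quotation, elision), so it is only safe on Lakota-only strings.
    """
    table = {ord(ch): target for ch in CANDIDATES}
    return text.translate(table)


if __name__ == "__main__":
    for ch in CANDIDATES:
        print(describe(ch))
```

Two of the candidates are modifier letters (general category Lm) and
one is punctuation (Po), so the choice also affects word-breaking and
searching, not just appearance.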

The real issue is for the mark used to indicate nasalization of vowels.

Lakota has three nasal vowels: nasalized forms of /i/, /a/, and /u/.
The 1982 orthography indicates these with digraphs, where the second
element is basically an n with a long right leg. Earlier discussion
of this had pointed to Unicode U+019E LATIN SMALL LETTER N WITH LONG RIGHT LEG
as this character. But that character has no associated uppercase character,
which is needed for the Lakota orthography.

The issue is complex, however. It is clear that this Lakota letter
is a new creation. If you go back to the source of this element of
the orthography, you can find it in Buechel, 1939, A Grammar of Lakota,
which represents the vowels this way, but using what is clearly a
lowercase Greek letter eta (i.e. U+03B7). This, in turn, derived from
a 19th century Dakota alphabet created by Episcopal missionaries and
associated particularly with the name of Stephen R. Riggs. The Greek
letter eta was often a printing substitution for eng (i.e. U+014B),
to indicate nasalization. So we have a complicated confusion here of
three letterforms.

U+019E was proposed in the IPA Principles (1949) for use in digraphic
spellings of nasal vowels -- presumably as a way of regularizing the
eta/eng confusion. But the letter was withdrawn from the IPA in 1976.

However, presumably because of the enormous impact of the missionary
orthography on the history of the written Lakota language, the
digraphic spelling of nasal vowels was preferred by the Lakota
educators when deciding on the 1982 orthography, over the general
Siouan linguistic tradition of writing nasal vowels with ogoneks.
Effectively, this meant a resurrection of the n-with-long-right-leg,
since the orthography was intended to be Latin, not Latin with one
Greek letter eta.

The practical orthographies used in the missionary dictionaries and
grammars, and the technical linguistic orthography of Boas and Deloria,
never had to decide on the problem of how to uppercase the nasal
vowel, since as a digraphic representation, the nasal indicator never
occurs initially, and those sources don't use all-cap text anywhere.
But the 1982 orthography is intended for general use, and that means
that the Lakota text can also occur in all-cap environments such
as chapter headers.
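
The all-caps problem can be checked mechanically. The following is a
rough Python sketch, not a definitive test: it asks the runtime's own
Unicode data whether each of the three letterforms discussed above
(eta, eng, n with long right leg) has a distinct uppercase mapping.
The function name uppercase_report is hypothetical, and the answer for
U+019E depends on which version of the Unicode data the Python runtime
ships with, so no expected output is shown for it.

```python
import unicodedata

# Letterforms the text says have been used for the nasal marker.
NASAL_MARKERS = {
    "eta": "\u03B7",
    "eng": "\u014B",
    "n with long right leg": "\u019E",
}


def uppercase_report(ch):
    """Report whether `ch` has a distinct uppercase in this runtime's data."""
    upper = ch.upper()
    return {
        "char": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "has_distinct_upper": upper != ch,
        # Join code points in case a mapping expands to several characters.
        "upper": " ".join(f"U+{ord(c):04X}" for c in upper),
    }


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    for label, ch in NASAL_MARKERS.items():
        print(label, uppercase_report(ch))
```

If a character has no distinct uppercase, an all-caps header would
leave it in lowercase form in the middle of capital letters, which is
the situation described above.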

So as in the case of African languages that adopted an IPA-based
orthography, and then created uppercase versions of letters that
had no uppercase in IPA (cf. U+0186, U+018F, U+01A9, for example),
we have another instance here of orthographic usage driving the
need for a new uppercase character: LATIN CAPITAL LETTER N WITH LONG
RIGHT LEG.


