Re: Tagging orthographic systems (was: (iso639.186) the

From: Kenneth Whistler
Date: Wed Sep 13 2000 - 19:26:50 EDT

John Hudson responded:

> Tom Emerson wrote:
> >My point is that for some languages there is no single orthography
> >that can ever be nailed down. It may depend on the individual
> >author. Outside of a constrained register corpus these tags would
> >be less useful.
> Individual authors' idiosyncratic spellings are not a candidate for
> tagging; widely used and, in the case of German, for example, officially
> published orthographies are.

Au contraire. While your assessment may be essentially correct for
standardized languages, there are numerous instances where the
individual "authors'" idiosyncrasies are precisely what is at issue.

The obvious case that linguists are interested in is collections of
manuscript corpora of transcriptions of otherwise unwritten and
unstandardized languages. With no standardized system, each
corpus may follow its own conventions, and interpreting the data
depends on tracking which transcriber used what conventions.
And even for a single transcriber, the conventions they used may
change over time, so you have to "tag" information into personal
"eras" even for a single transcriber.

To give you one very evident example from North American practice:
There exists an immense collection of primary linguistic materials
generically called "Harringtonia", now housed mostly in the Anthropological
Archives of the Smithsonian Institution. J.P. Harrington was a
monomaniacal but extremely talented linguistic recorder who worked
on dozens of languages, primarily in California, but elsewhere on
occasion throughout North and Central America, extending over a
period from about 1906 to his death in 1961. He used
an idiosyncratic, but rather accurate, adaptation of IPA, and his
conventions of symbol usage changed somewhat over time. So for
Harrington's Chumash language recordings, you will find systematic
differences between how he transcribed it in 1909 and how he transcribed
it in 1959. And any of his transcriptions are orthographically distinct
from the conventions used by any other anthropologist or linguist
who dealt with the same language(s) before, during, or after the
time Harrington worked on it.

So there is a perfect case for the need for an idiosyncratic
orthographic tag -- in fact for several, changing over time --
for a single person's transcription of a single language.
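
To make the shape of such a tag concrete, here is a minimal sketch in
Python. Everything in it is an assumption made for illustration: the
OrthographyTag class, the find_tag helper, the label format, and
especially the era boundaries, which are invented placeholders and not
a claim about when Harrington's conventions actually changed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class OrthographyTag:
    """One transcriber's conventions for one language during one period.

    Hypothetical structure; not part of any existing tagging standard.
    """
    language: str
    transcriber: str
    first_year: int  # inclusive
    last_year: int   # inclusive

    def covers(self, language: str, transcriber: str, year: int) -> bool:
        """True if this tag applies to a record with these attributes."""
        return (
            self.language == language
            and self.transcriber == transcriber
            and self.first_year <= year <= self.last_year
        )

    def label(self) -> str:
        """Render the tag as a string (ad hoc format, illustration only)."""
        return (
            f"{self.language}/{self.transcriber}/"
            f"{self.first_year}-{self.last_year}"
        )


# Placeholder eras: the split at 1930 is invented purely for illustration.
TAGS: List[OrthographyTag] = [
    OrthographyTag("chumash", "harrington", 1906, 1930),
    OrthographyTag("chumash", "harrington", 1931, 1960),
]


def find_tag(
    tags: List[OrthographyTag], language: str, transcriber: str, year: int
) -> Optional[OrthographyTag]:
    """Return the first tag covering the record, or None if none applies."""
    for tag in tags:
        if tag.covers(language, transcriber, year):
            return tag
    return None
```

Under these assumptions, a 1909 record and a 1959 record by the same
transcriber for the same language resolve to two different tags, and a
record by any other transcriber resolves to none until that person's
conventions have been catalogued.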


