Re: mixed-script writing systems

From: Kenneth Whistler (kenw@sybase.com)
Date: Fri Nov 15 2002 - 16:18:58 EST


    > So, the question is this: Should we say that this writing system is
    > completely Latin (keeping the norm that orthographic writing systems use a
    > single script) and apply the principle of unification -- across languages
    > but not across scripts -- to imply that we need to encode new characters,
    > Latin delta, Latin theta and Latin yeru? Or, do we say that this writing
    > system is only *mostly* Latin-based, and that it mixes in a few characters
    > from other scripts?

    If everyone can hold off on the Kurdish rhetoric for the moment,
    it should be clear that such mixed orthographies as Peter has
    shown in Wakhi are best handled by simply using the characters
    that are already encoded, rather than cloning more and more
    characters into Latin, Greek, and Cyrillic to deal with the
    artificial constraint that would claim that any LGC-based
    alphabet *must* consist only of a single script. In point of fact,
    people for centuries have been borrowing back and forth between
    Latin, Greek, and Cyrillic in particular, so that in some respects
    LGC is a kind of metascript and should be treated as such.

    Note that we will run across many other examples of such cross-script
    LGC letter borrowings in various oddball orthographies. One I
    happen to know about is the publication by Morris Swadesh of
    extensive texts of Wakashan languages using Cyrillic che (U+0447)
    in the midst of otherwise Latin letters, where most Americanists
    would currently use Latin c-hacek (U+010D).
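
    To make the code point distinction concrete, here is a rough
    sketch of how one might spot such a cross-script borrowing in
    text. It is only an illustration: the function name lgc_scripts
    and the sample string are hypothetical, and it uses the first
    word of each Unicode character name as a stand-in for the Script
    property, which the Python standard library does not expose.

```python
import unicodedata

LGC = {"LATIN", "GREEK", "CYRILLIC"}

def lgc_scripts(text):
    """Return the set of Latin/Greek/Cyrillic labels for letters in text.

    Assumption: the first word of the character name approximates the
    Unicode Script property. This is a heuristic, not the real property.
    """
    labels = set()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, combining marks
        first_word = unicodedata.name(ch, "").split(" ")[0]
        if first_word in LGC:
            labels.add(first_word)
    return labels

che = "\u0447"      # CYRILLIC SMALL LETTER CHE
c_hacek = "\u010D"  # LATIN SMALL LETTER C WITH CARON

# Made-up strings for illustration only, not attested words.
mixed_word = "a" + che + "a"
latin_word = "a" + c_hacek + "a"

print(sorted(lgc_scripts(mixed_word)))
print(sorted(lgc_scripts(latin_word)))
```

    A string that mixes labels in this way is not an error under the
    approach argued for here; it simply reflects an orthography that
    draws on already-encoded characters from more than one script.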

    It isn't doing anyone any favors to keep cloning such cross-script
    borrowings into the character encoding standard, *unless* there
    is strong evidence of script-specific adaptation of the letters
    after their borrowing. The handling of Latin Q in the otherwise
    Cyrillic Kurdish alphabet is what makes it the marginal case it
    is and argues for encoding of a separate Cyrillic Q. I do not,
    however, believe that such arguments apply to cases such as
    this Wakhi instance, unless Peter or someone else could demonstrate
    specific "Latin-scriptfication" of the borrowed letters in the
    orthography.

    --Ken