RE: Are Latin and Cyrillic essentially the same script?

From: Peter Constable (petercon@microsoft.com)
Date: Fri Nov 19 2010 - 01:15:56 CST

    If you'd like a precedent, here's one: IPA is a widely-used system of transcription based primarily on the Latin script. In comparison to the Janalif orthography in question, there is far more existing data. Also, whereas that Janalif orthography is no longer in active use--hence there are no new texts to be represented (there are at best only new citations of existing texts)--IPA is a writing system in active use with new texts being created daily; thus, the body of digitized data for IPA is growing much faster than the body of data in the Janalif orthography. And while IPA is primarily based on Latin script, not all of its characters are Latin characters: bilabial and interdental fricative phonemes are represented using the Greek letters beta and theta.

    Given a precedent of a widely-used Latin writing system for which it is considered adequate to have characters of central importance represented using letters from a different script, Greek, it would seem reasonable if someone made the case that it's adequate to represent an historic Latin orthography using Cyrillic soft sign.
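
    The code points involved can be inspected directly. The following is a minimal sketch, assuming Python's standard unicodedata module; the dictionary labels and the script_prefix helper are illustrative names, and the first word of a character name is only a rough stand-in for the Script property, which unicodedata does not expose.

```python
import unicodedata

# Characters mentioned in the message: the two Greek letters used in IPA
# and the Cyrillic soft sign suggested for Janalif text.
# The labels are hypothetical names chosen for this sketch.
CHARS = {
    "ipa_bilabial_fricative": "\u03b2",
    "ipa_interdental_fricative": "\u03b8",
    "soft_sign_capital": "\u042c",
    "soft_sign_small": "\u044c",
}


def script_prefix(ch):
    """Return the first word of the character name.

    This is a heuristic label only; the authoritative Script property
    lives in the Unicode Character Database file Scripts.txt.
    """
    return unicodedata.name(ch).split()[0]


for label, ch in CHARS.items():
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} ({label})")
```

    Under this heuristic the two IPA letters report as GREEK and the soft sign as CYRILLIC, which is the situation the argument above relies on.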

    Peter

    -----Original Message-----
    From: Asmus Freytag [mailto:asmusf@ix.netcom.com]
    Sent: Thursday, November 18, 2010 11:05 AM
    To: Peter Constable
    Cc: André Szabolcs Szelp; Karl Pentzlin; unicode@unicode.org; Ilya Yevlampiev
    Subject: Re: Are Latin and Cyrillic essentially the same script?

    On 11/18/2010 8:04 AM, Peter Constable wrote:
    > From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]
    > On Behalf Of André Szabolcs Szelp
    >
    >> AFAIR the reservations of WG2 concerning the encoding of Jangalif
    >> Latin Ь/ь as a new character were not in view of Cyrillic Ь/ь, but
    >> rather in view of its potential identity with the tone sign mentioned
    >> by you as well. It is a Latin letter adapted from the Cyrillic soft
    >> sign,
    > There's another possible point of view: that it's a Cyrillic character that, for a short period, people tried using as a Latin character but that never stuck, and that it's completely adequate to represent Janalif text in that orthography using the Cyrillic soft sign.
    >
    >

    When one language borrows a word from another, there are several stages of "foreignness", ranging from treating the foreign word as a short quotation in the original language to treating it as essentially fully native.

    Now words are very complex in behavior and usage compared to characters.
    You can look at pronunciation, spelling and adaptation to the host grammar to determine which stage of adaptation a word has reached.

    When a script borrows a letter from another, you are essentially limited in what evidence you can use to document objectively whether the borrowing has crossed over the script boundary and the character has become "native".

    With typographically closely related scripts, getting tell-tale typographical evidence is very difficult. After all, these scripts started out from the same root.

    So, you need some other criteria.

    You could individually compare orthographies and decide which ones are "important" enough (or "established" enough) to warrant support. Or you could try to distinguish between orthographies for general use within the given language, vs. other systems of writing (transcriptions, say).

    But whatever you do, you should be consistent and take account of existing precedent.

    There are a number of characters encoded as nominally "Latin" in Unicode that are borrowings from other scripts, usually Greek.
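
    A few such borrowings can be listed side by side with their Greek counterparts. This is a sketch only, assuming Python's standard unicodedata module; the four pairs are chosen for illustration and are not a list taken from this message, and name_pairs is a hypothetical helper.

```python
import unicodedata

# Illustrative pairs: (character encoded as Latin, its Greek counterpart).
# The selection is an assumption made for this sketch.
PAIRS = [
    ("\u0251", "\u03b1"),  # alpha
    ("\u0263", "\u03b3"),  # gamma
    ("\u0269", "\u03b9"),  # iota
    ("\u028a", "\u03c5"),  # upsilon
]


def name_pairs(pairs):
    """Return (latin_name, greek_name) tuples for each pair of characters."""
    return [(unicodedata.name(latin), unicodedata.name(greek))
            for latin, greek in pairs]


for latin_name, greek_name in name_pairs(PAIRS):
    print(f"{latin_name:<30} <- {greek_name}")
```

    Each Latin-named character here has a separately encoded Greek letter of the same shape family, which is the kind of precedent a consistent policy would need to account for.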

    A discussion of the current issue should include explicit explanation of why these precedents apply or do not apply, and, in the latter case, why some precedents may be regarded as examples of past mistakes.

    By explicitly analyzing existing precedents, it should be possible to avoid the impression that the current discussion is focused on the relative merits of a particular orthography based on personal and possibly arbitrary opinions by the work group experts.

    If it can be shown that all other cases where such borrowings were accepted into Unicode are based on orthographies that are more permanent, more widespread or both, or where other technical or typographical reasons prevailed that are absent here, then any decision on the current request would seem a lot less arbitrary.

    I don't know where the right answer lies in the case of Janalif, or which point of view, in Peter's phrasing, would make the most sense, but having this discussion without a clear understanding of the precedents will lead to inconsistent encoding.

    A./


