Re: Writing Tatar using the Latin script; new characters to encode?

From: Peter Kirk (peterkirk@qaya.org)
Date: Sun Jul 18 2004 - 17:39:47 CDT

    On 18/07/2004 16:22, Alexander Savenkov wrote:

    >Hello,
    >
    >(delayed response)
    >
    >2004-05-12T19:37:51+03:00 Ernest Cline <ernestcline@mindspring.com> wrote:
    >
    >
    >
    >> From: Alexander Savenkov <savenkov@xmlhack.ru>
    >>
    >>
    >>>2004-05-12T03:08:59+03:00 Eric Muller <emuller@adobe.com> wrote:
    >>>
    >>>
    >>>
    >>>>According to <www.eki.ee>, there is currently an effort to convert the
    >>>>writing of Tatar from Cyrillic to Latin.
    >>>>
    >>>>
    >>>>1. Does somebody have more information about that effort?
    >>>>
    >>>>
    >>>Perhaps it's their own effort.
    >>>
    >>>
    >>>
    >>>>Eki lists four characters as needed but missing in Unicode (see
    >>>><http://www.eki.ee/letter/chardata.cgi?lang=tt+Tatar&script=latin>).
    >>>
    >>>
    >>>
    >>>>2. The case pair for barred o is encoded (U+019F and U+0275), and it
    >>>>seems that their confusion comes from less-than-perfect but annotated
    >>>>name for U+019F, and from the usage remark "African". Can we
    >>>>authoritatively tell them that those two characters are the ones they
    >>>>want? Can we add a "Tatar" usage remark to both?
    >>>>
    >>>>
    >>>Is there a need for this? You don't want to tell everyone on the net
    >>>about his or her wrong assumptions. There's one sentence in the page
    >>>you mentioned that gives a good description of this resource:
    >>>
    >>>"The conversion from Cyrillic to Latin script is planned within years
    >>>2001-2011."
    >>>
    >>>This is false.
    >>>
    >>>
    >>>
    >>>>3. The case pair n with descender is definitely not encoded, and from my
    >>>>memory of the discussion of ghe with descender, we would want to encode
    >>>>them as separate characters (rather than with combining descenders on
    >>>>"n"). Is anybody working on that proposal?
    >>>>
    >>>>
    >>>There's no Latin Tatar script. It's the law. Full stop.
    >>>
    >>>It's the Institute of Estonian language. I hope they know more about
    >>>Estonian than about other languages and Unicode.
    >>>
    >>>
    >
    >
    >
    >>There are numerous sites on the web about the change from Cyrillic to
    >>Latin for Tatar that is planned for completion by 2011 by the Republic
    >>of Tatarstan (a part of the Russian Federation).
    >>
    >>
    >
    >Ernest, I fail to see how the fact that there are numerous sites about
    >Latin for Tatar proves it really exists. There are numerous sites
    >about Babylon 5 and Frankenstein. What are your thoughts about these?
    >
    >
    >
    >>There is legal wrangling
    >>over whether Tatarstan can make the change back to Latin script official
    >>for Tatar as it is used there, but no final decision has been reached and
    >>there are probably at least several more years of legal shenanigans
    >>before it is reached.
    >>
    >>
    >
    >You're wrong and the facts you give here are outdated. Legal wrangling
    >is over. See links below (in Russian).
    >
    >...
    >
    >
    >
    >>As for the merits of the proposed change back to Latin, I think
    >>it is silly for Tatarstan to make the change and it is silly for the
    >>Russian Federation to oppose it.
    >>
    >>
    >
    >Your clever thoughts are really helpful. I wonder what Russians and
    >Tatars would do without them.
    >
    >Links in Russian:
    >http://www.tatar.ru/?DNSID=0627096ec5c075004c0d219207f349de&node_id=978
    >
    >

    An article about language on the official Tatarstan government website.
    Last paragraph:

    > В целях дальнейшего совершенствования татарского алфавита на основе
    > латинской графики и создание благоприятных условий для его вхождения в
    > систему мировой коммуникации 15 сентября 1999 года принят Закон
    > Республики Татарстан "О восстановлении татарского алфавита на основе
    > латинской графики".

    With the aim of further improving the Tatar alphabet based on Latin
    script and creating favourable conditions for its entry into the
    system of world communications, on 15th September 1999 the Law of the
    Republic of Tatarstan "On the restoration of the Tatar alphabet based
    on Latin script" was adopted.

    >http://www.tatar.ru/00001296_c.html
    >
    >

    This is the text of that law. Article 5:

    > Настоящий Закон вступает в силу с 1 сентября 2001 года.

    This law comes into force on 1st September 2001.

    >http://www.tatar.ru/index.php?node_id=1006
    >
    >

    The alphabet, with pronunciations in Latin and Cyrillic. This alphabet
    consists of Latin-1 characters plus schwa, G breve, dotless i, dotted I,
    N with a descender, barred O (019F/0275) and S with cedilla. The N/n
    with descender might cause a problem because 014A/014B do not look quite
    right. (But a rather different form of the same alphabet appears in the
    left column of http://www.tatar.ru/?node_id=2611; here the N with
    descender looks like 014A/014B with the alternative form of the capital
    looking like the small letter.) The case mapping of the i's is as in
    Turkish.
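
    To make the character list above easier to check, here is a small
    Python sketch that prints the formal Unicode name of each code point
    I mentioned. The list CODE_POINTS and the helper describe() are
    names made up for this illustration, and U+014A/U+014B (eng) are
    included only as the nearest existing candidates for N with
    descender, not as a confirmed match.

```python
import unicodedata

# Code points taken from the alphabet description above. U+014A/U+014B are
# an assumption: they are the closest existing letters, not a confirmed
# encoding of N with descender.
CODE_POINTS = [
    0x018F, 0x0259,  # schwa
    0x011E, 0x011F,  # G with breve
    0x0130, 0x0131,  # dotted capital I, dotless small i
    0x014A, 0x014B,  # eng (candidate for N with descender)
    0x019F, 0x0275,  # barred O
    0x015E, 0x015F,  # S with cedilla
]


def describe(cp):
    """Return 'U+XXXX NAME' for one code point."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp))}"


for cp in CODE_POINTS:
    print(describe(cp))
```

    The formal name reported for U+019F says nothing about a barred O,
    which may be part of the confusion Eric described, but the default
    case mapping does pair it with U+0275.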

    >http://www.tatar.ru/?DNSID=0627096ec5c075004c0d219207f349de&node_id=2610
    >
    >

    This describes Inalif, an experimental Tatar Latin alphabet for use on
    the Internet, based on the alphabet in the 1999 law.

    >http://www.tatar.ru/?node_id=2611
    >
    >

    This gives details of Inalif, which appears to be ASCII-only and rather
    like the Uzbek Latin alphabet. This page, dated December 2003, refers to
    the 1999 alphabet as "the official Latin alphabet", and is signed by
    many prominent Tatars, at least one of whom is a top Tatarstan
    government official. It also mentions that the 1999 alphabet is used in
    some electronic newspapers and official websites.

    >http://peoples.org.ru/proekt.html
    >
    >

    A Russian federal law, undated, which claims that all state languages of
    the Russian Federation must use Cyrillic script.

    >http://peoples.org.ru/stenogramma.html
    >
    >

    A discussion of this law dated 2002.

    >Alexander.
    >
    >

    Conclusion:

    1) The Republic of Tatarstan passed a law in 1999, which came into
    force in 2001, establishing a Tatar Latin alphabet.

    2) A Russian federal law (a monstrous piece of linguistic imperialism)
    overrode this in 2002, that is, after the Tatarstan law had come into force.
    Therefore this Latin alphabet was in some sense officially in force for
    a period. And it is still considered to be officially in force by many
    in Tatarstan including top government officials.

    3) As the people of Tatarstan are independent-minded and more likely to
    follow their local leaders than the linguistic imperialists in Moscow,
    it is highly likely that at least some of them use the published Latin
    script even if it is not permitted to have official status.

    4) Not all speakers of the Tatar language live in the Russian
    Federation, and some live in countries like Azerbaijan where the
    official alphabets use Latin script. In such areas they are likely to
    use the Latin script.

    5) This is an alphabet which has been used, even in official websites,
    and very likely continues to be used by some. Decisions made in Moscow
    do not change this, especially because they are in practice widely
    ignored in Tatarstan and have no force in some other places where Tatars
    live. This alphabet therefore needs to be supported by Unicode. But
    fortunately this is not a problem as all the characters are already defined.

    The only remaining issue is that this is another alphabet which uses the
    special Turkish case pairing for I; there are in fact several such
    alphabets in use other than Turkish and Azerbaijani. By the way, the
    government of Azerbaijan has recently officially stated that its state
    language should be known in English as Azerbaijani, not Azeri. Is there
    still a place in the standard where these two languages are named as
    having special case mapping? If so, I can't find it, but it needs to be
    reviewed.
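
    For anyone unsure what the special case pairing means in practice,
    here is a minimal Python sketch. Python's built-in upper() and
    lower() apply only the default, language-independent mappings, so
    the dotted/dotless pairs have to be handled before calling them. The
    function names turkic_upper and turkic_lower are my own for this
    illustration, and the sketch ignores decomposed input (I followed by
    U+0307). As far as I know, the language-conditioned mappings
    themselves are listed in the Unicode data file SpecialCasing.txt
    under the language tags tr and az.

```python
DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I


def turkic_upper(text):
    """Uppercase with i -> dotted capital I and dotless i -> I."""
    # Map the two special letters first, then let upper() do the rest.
    text = text.replace("i", DOTTED_CAPITAL_I)
    text = text.replace(DOTLESS_SMALL_I, "I")
    return text.upper()


def turkic_lower(text):
    """Lowercase with I -> dotless i and dotted capital I -> i."""
    # Plain I must be mapped before dotted I, so that the "i" produced
    # from dotted I is not touched again.
    text = text.replace("I", DOTLESS_SMALL_I)
    text = text.replace(DOTTED_CAPITAL_I, "i")
    return text.lower()


# The Turkish place name Diyarbakir, written with dotted and dotless i.
lower_name = "d\u0069yarbak\u0131r"
upper_name = "D\u0130YARBAKIR"

print(turkic_upper(lower_name) == upper_name)
print(turkic_lower(upper_name) == lower_name)
print(lower_name.upper() == upper_name)  # default mapping loses the dot
```

    The last comparison is false: the default mapping turns both i's
    into a plain capital I, which is exactly the loss that a
    language-specific mapping has to prevent for alphabets like this
    one.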

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    

