Re: Transliteration in Asia, was Re: Hausa: Boko<->Ajami?

From: Donald Z. Osborn
Date: Tue Jul 06 2004 - 09:44:24 CDT

    The situation described below for the Uzbek Latin alphabet seems to contrast
    with the approach taken for African language transcription since the 1960s.*
    The latter, although it has its irregularities and differences among countries
    & languages, is on the whole fairly consistent and intuitive. One of the
    explicit objects in a series of conferences on harmonization of transcription
    coordinated through UNESCO was to avoid the kinds of cumbersome and
    counterintuitive spelling combinations that many European languages had evolved
    to make the Latin alphabet (pre-ASCII?) conform to their needs. Even many/most
    of the extended characters used in many languages, particularly in West Africa,
    are not so strange in form as to defeat the unfamiliar reader.

    This legacy is helpful, at least on a technical level (the policy level is
    another matter), for today's efforts to support education in these languages.
    It may also facilitate operations such as transliteration, where necessary,
    and text-to-speech/speech-to-text conversions.

    *(More info, for anyone interested, is at )

    Don Osborn

    Quoting Philipp Reichmuth <>:

    > Peter Kirk wrote:
    > > (There are also languages written in Arabic and Indic scripts, but I
    > > don't know enough about these to be helpful.)
    > In principle, the situation with Arabic loanwords and the need to retain
    > the original Arabic spelling is the same there.
    > > Most of these conversions can be programmed easily, although there is
    > > a small problem with the new Uzbek Latin alphabet, deliberately
    > > based on ASCII only plus apostrophe serving as a diacritic, for sh,
    > > ch and gh are usually digraphs [...]
    > This is indeed a problem, though only with "sh" and possibly "iy", and
    > the latter appears only in word-final position. As far as I know, "c" is
    > not used in Uzbek except in the "ch" digraph, and the apostrophed
    > digraphs "o'" and "g'" are not really problematic in this respect.
    > Nevertheless, it is an extremely awkward alphabet from a typographic
    > point of view, and it is also not exceedingly systematic to write "sh"
    > and "ch" but "g'". The new official Qaraqalpaq Latin alphabet proposed by
    > the Uzbek government is even weirder: it uses "a'", "i'", "u'" and "n'"
    > in addition to the Uzbek "g'" and "o'".
    > > Changing in and out of Arabic script is much more complicated. The
    > > main issue is that Arabic loan words (which are common in most of
    > > these languages) usually have to be spelled exactly as in Arabic
    > > (oddly, except for TEH MARBUTA which becomes either TEH or HEH) even
    > > though many of the distinctions are lost in pronunciation and
    > > therefore in Latin and Cyrillic script.
    > Even in Latin<->Cyrillic conversion, you have the same problems with
    > Russian loanwords (Uzbek "январь" > "yanvar" being a borderline case,
    > but what about "вулканизация" > "vulkanizasiya"?) or Western
    > proper names.
    > > As a very simple example, "Sudan" in Turkish or Azerbaijani can be
    > > the name of a country or it can mean "from water", and the correct
    > > Arabic spellings are likely to be very different, and can be
    > > disambiguated only by complete parsing of the context.
    > I think in this particular case the spellings would be the same, but the
    > point is valid nonetheless, of course.
    > Philipp
    > --
    > As important as the bride is to the wedding
    > is Bullrich salt to the digestion!
    > - Bullrich-Salz, 1951
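
The digraph issue raised in the quoted message (that "sh", "ch", "g'" and "o'" each stand for a single letter) can be illustrated with a small longest-match converter. What follows is only a rough sketch, not a complete or official conversion: the table is partial and lowercase-only, the function name `latin_to_cyrillic` is made up for illustration, and the "s'h" entry assumes that an apostrophe is written between "s" and "h" when they are separate letters.

```python
# Partial, illustrative Uzbek Latin -> Cyrillic table (lowercase only).
# Assumption: these entries, and the "s'h" separator rule in particular,
# are for illustration; this is not a complete or official mapping.
TABLE = {
    "s'h": "сҳ",  # apostrophe keeps "s" + "h" from being read as "sh"
    "sh": "ш", "ch": "ч", "g'": "ғ", "o'": "ў",
    "a": "а", "b": "б", "d": "д", "e": "е", "g": "г", "h": "ҳ",
    "i": "и", "k": "к", "o": "о", "q": "қ", "r": "р", "s": "с",
    "t": "т", "y": "й", "z": "з",
}
MAX_KEY = max(len(key) for key in TABLE)


def latin_to_cyrillic(text: str) -> str:
    """Convert text by always taking the longest table key that matches."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so "sh" wins over "s" followed by "h".
        for size in range(MAX_KEY, 0, -1):
            chunk = text[i:i + size]
            if chunk in TABLE:
                out.append(TABLE[chunk])
                i += len(chunk)
                break
        else:
            # Characters outside the table are passed through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

With this table, `latin_to_cyrillic("g'isht")` gives "ғишт" and `latin_to_cyrillic("is'hoq")` gives "исҳоқ". Without the apostrophe, "ishoq" comes out as "ишоқ": the converter has no way to tell a digraph from two adjacent letters unless the orthography marks the difference, which is the "sh" problem described above.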
