> This has always puzzled me, because Cyrillic includes lots of other
> characters that transliterate to two or more Latin letters. CH, SH, SHCH,
> and ZH leap to mind; there may be more. What was the thought process behind
> providing these compatibility characters only for the Serbo-Croatian
> additions to Cyrillic, but not for the other Cyrillic characters?
Because those Cyrillic letters (except SHCH, which is not used)
transliterate to single Latin letters in Serbo-Croat. (AFAIK, Croat is
hardly ever written in Cyrillic letters today; I don't know whether
Serbian is often written in Latin letters.)
Historically, the mapping between Cyrillic and Latin was so close that
a manuscript might be typed in Latin and submitted to a publisher who
set the resulting book in Cyrillic. Crossword puzzles placed "dz",
"lj", "nj", and "dz-caron" in a single square, so that solvers could use
either script. Every literate person could (probably still can) read
either script with essentially equal facility.
The Unicode situation is meant to perpetuate this sort of 1-1 lossless
transliteration, which is really not "transliteration" at all in the
sense of a Latin transliteration of Russian or a Cyrillic
transliteration of English, which involve varying amounts of
lossy conversion. Nor is it like Mongolian in Cyrillic script vs.
Mongolian in the traditional Mongolian script, where the mapping is not
even computer-tractable, since the traditional orthography (mn-Mong)
represents a much older stage of the language.
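To make the 1-1 point concrete, here is a minimal sketch (mine, not
anything taken from the standard) of a lowercase-only round trip. The
names CYRILLIC, LATIN, to_latin, to_cyrillic, and to_plain_latin are
made up for illustration. It assumes the Latin side uses the
single-code-point digraph characters U+01C6, U+01C9, and U+01CC, and
that their compatibility decompositions are applied via NFKC only for
display.

```python
import unicodedata

# Lowercase Serbian Cyrillic alphabet (30 letters).
CYRILLIC = "абвгдђежзијклљмнњопрстћуфхцчџш"

# Latin counterparts in the same order, one code point per letter.
# The three digraph letters use the compatibility characters
# U+01C9 (lj), U+01CC (nj), and U+01C6 (dz with caron).
LATIN = (
    "abvgd\u0111e\u017ezijkl"
    "\u01c9mn\u01ccoprst"
    "\u0107ufhc\u010d\u01c6\u0161"
)
assert len(CYRILLIC) == len(LATIN) == 30

TO_LATIN = str.maketrans(CYRILLIC, LATIN)
TO_CYRILLIC = str.maketrans(LATIN, CYRILLIC)


def to_latin(text):
    """Cyrillic -> Latin, keeping each digraph as a single character."""
    return text.translate(TO_LATIN)


def to_cyrillic(text):
    """Latin (with digraph characters) -> Cyrillic."""
    return text.translate(TO_CYRILLIC)


def to_plain_latin(text):
    """Cyrillic -> Latin with digraphs expanded to two letters (lossy)."""
    return unicodedata.normalize("NFKC", to_latin(text))
```

With the digraph characters, to_cyrillic(to_latin(s)) gives back s for
any lowercase Serbian text. Once the digraphs are expanded to two
ordinary letters, that guarantee is gone: a plain "nj" could be either
the single Cyrillic letter or the two separate letters found in a word
like "injekcija".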
> Of course, I am not at all suggesting that any such additional characters be
> added. The existing compatibility characters require three code points each
> (uppercase, titlecase, and lowercase) and I was under the impression that
> they were deprecated, though I could find no mention of that in TUS 3.0.
They have compatibility decompositions, which is one kind of
-- 
There is / one art       || John Cowan <firstname.lastname@example.org>
no more / no less        || http://www.reutershealth.com
to do / all things       || http://www.ccil.org/~cowan
with art- / lessness     \\ -- Piet Hein