From: Karl Pentzlin (firstname.lastname@example.org)
Date: Wed Nov 10 2010 - 10:08:25 CST
As shown in N3916: http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3916.pdf
= L2/10-356, there exists a Latin letter which resembles the Cyrillic
soft sign Ь/ь (U+042C/U+044C). This letter is part of the Jaꞑalif
variant of the alphabet, which was used for several languages in the
former Soviet Union (e.g. Tatar), and was developed in parallel to the
alphabets nowadays in use for Turkish and Azerbaijani, see:
In fact, it was proposed on this basis, being the only Jaꞑalif letter
missing so far, since the ꞑ (occurring in the alphabet name itself)
was introduced with Unicode 6.0.
The letter is not a soft sign; it is the exact Tatar equivalent of the
Turkish dotless i, and thus its use is similar to that of the Cyrillic yeru
In this function, it is a part of the adaptation of the Latin alphabet
for many non-Russian languages in the Soviet Union in the 1920s,
see e.g.: Юшманов, Н. В.: Определитель Языков. Москва/Ленинград 1941,
(A proposal regarding this subject is expected for 2011.)
Thus, it shares with the Cyrillic soft sign its form and partly the
geographical area of its use, but in no case its meaning. The same can
be said, e.g., of P/p (U+0050/U+0070, Latin letter P) and Р/р
(U+0420/U+0440, Cyrillic letter ER).
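The distinction between such lookalike pairs can be checked mechanically.
The following is only a minimal sketch in Python: it assumes that the first
word of the Unicode character name is a good enough stand-in for the script,
since the standard library does not expose the Script property directly.
The helper name script_prefix is made up for this illustration.

```python
import unicodedata

# Visually similar pairs mentioned in the text: (Latin, Cyrillic).
PAIRS = [("P", "\u0420"), ("p", "\u0440")]

def script_prefix(ch):
    """Return the first word of the character name, e.g. 'LATIN'.

    Assumption: this only approximates the Unicode Script property.
    """
    return unicodedata.name(ch).split()[0]

for latin, cyrillic in PAIRS:
    # Same shape on paper, different code points and different scripts.
    print(f"U+{ord(latin):04X} {unicodedata.name(latin)}")
    print(f"U+{ord(cyrillic):04X} {unicodedata.name(cyrillic)}")

# The soft sign currently exists only as a Cyrillic letter.
print(script_prefix("\u044C"))
```

The point of the sketch is merely that identical shapes already carry
separate code points when the scripts differ.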
According to the pre-preliminary minutes of UTC #125 (L2/10-415),
the UTC has not accepted the Latin Ь/ь.
It is an established practice for the European alphabetic scripts to
encode a new letter only if it has a different shape (in at least one
of the capital and small forms) compared to all already encoded
letters of the same script. The Y/y is well known to denote completely
different pronunciations, used as consonant as well as vowel, even within
the same language. Thus, if somebody unearths a Latin letter E/e in some
obscure minority language which has no E-like vowel, used to denote an M-like
sound and in fact collated after the M in the local alphabet, this
will probably not lead to a new encoding.
But Latin and Cyrillic are different scripts (the question in the "Re"
of this mail is rhetorical, of course).
Admittedly, there also is a precedent for using Cyrillic letters in
Latin text: the use of U+0417/U+0437 and U+0427/U+0447 for tone
letters in Zhuang. However, the orthography using them was
short-lived, being superseded by another Latin orthography which uses
genuine Latin letters as tone marks (J/j and X/x, in this case).
On the other hand, Jaꞑalif and the other Latin alphabets which use Ь/ь
did not lose the Ь/ь by an improvement of the orthography, but were
completely deprecated by a ukase of Stalin. Thus, they continue to be
"the" Latin alphabets of the respective languages.
Whether formally requesting a revival or not, they are regarded as valid
by the members of the cultural group (even if only to access their cultural
In particular, it cannot be ruled out that people will want to create Latin
domain names or e-mail addresses without being accused of script mixing.
Taking this into account, not to mention the technical problems
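To illustrate the script-mixing concern: a naive check flags any Jaꞑalif
string as mixed once the Cyrillic soft sign has to stand in for the missing
Latin letter. This is only a rough sketch in Python, not the check any
registry actually applies; it approximates the script by the first word of
the character name, and the function names and the sample string are
illustrative assumptions.

```python
import unicodedata

def scripts_in(label):
    """Approximate the set of scripts used by the letters of `label`.

    Assumption: the first word of the character name stands in for the
    Unicode Script property, which the standard library does not expose.
    """
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}

def is_mixed(label):
    """True if the letters of `label` come from more than one script."""
    return len(scripts_in(label)) > 1

# Latin q, z, l with the Cyrillic soft sign U+044C as a stand-in letter.
sample = "q\u044Cz\u044Cl"
print(sorted(scripts_in(sample)))
print(is_mixed(sample))

# A string using only Latin letters, including U+A791, is not flagged.
print(is_mixed("ja\uA791alif"))
```

With a Latin code point for the letter, the first string would pass such a
check in the same way as the second.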
regarding collation etc. and the typographical issues when it comes to
subtle differences between Latin and Cyrillic in high quality
typography, it is really hard to understand why the UTC refuses to encode
the Latin Ь/ь.
A quick glance at the Юшманов table mentioned above proves that there
is absolutely no request to "duplicate the whole Cyrillic alphabet in
Latin", as someone may have feared.
- Karl Pentzlin