RE: Unicode Cyrillic GHE DE PE TE in Serbian

From: Janko Stamenovic (janko@teletrader.com)
Date: Wed Jan 05 2000 - 05:40:14 EST


OK, the discussion went on to talk only about Russian.

Now I have to explain to you what the issue is in Serbian Cyrillic.

In Serbian handwritten Cyrillic, small t is written like two Latin u
characters glued together (with only one line in the middle), with a stroke
over the top. However, this form never appeared in printed italic. But
another problem for Serbians is that, unlike Russians, Bulgarians or
Macedonians, they actually very much use both Latin AND Cyrillic for their
language. Since there are letters that look exactly the same in Latin and
Cyrillic, for Serbians the presence of more LATIN shapes in a word can make
them believe that they are reading a Latin word, and not a Cyrillic one!

Since both Latin and Cyrillic are used, Serbians always first "look" at the
whole word, determine which alphabet is used, and only then read it.
Sometimes they have to recognize the word itself: PECTOPAH is read as the
Serbian word for restaurant written in Cyrillic only because "pectopah"
(written with small letters it is obvious that it is not a Cyrillic word)
does not exist in the language.
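
To make the PECTOPAH point concrete, here is a minimal Python sketch. It is
only an illustration added for this note: the helper name scripts_used is
made up, and taking the first word of the Unicode character name as the
"script" is a shortcut that happens to work for these letters, not a general
script-detection method.

```python
import unicodedata

def scripts_used(word):
    """Return the first word of each character's Unicode name, as a set.

    For the letters used here that first word is LATIN or CYRILLIC.
    """
    return {unicodedata.name(ch).split()[0] for ch in word}

# Typed with Latin capitals P, E, C, T, O, P, A, H.
latin_word = "PECTOPAH"
# The Serbian word for restaurant, typed with Cyrillic capitals
# ER, IE, ES, TE, O, ER, A, EN. It looks identical in most fonts.
cyrillic_word = "\u0420\u0415\u0421\u0422\u041e\u0420\u0410\u041d"

print(scripts_used(latin_word))      # {'LATIN'}
print(scripts_used(cyrillic_word))   # {'CYRILLIC'}
print(latin_word == cyrillic_word)   # False
```

The computer can tell the two strings apart from the codes alone; the human
reader only has the shapes, which is the whole problem.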

So unfortunately we really don't see the appearance of "Latin m" and
"Latin n" instead of "small Cyrillic t" and "small Cyrillic p" as "a small
typographical issue".

(Some might say that we would probably be better off using only Latin, but
we have real examples of words in which l and j, or n and j, are two
separate Latin letters, which is not the same as the single letter "lj" or
"nj". So Cyrillic is "more precise", and Latin is used as a convenience, but
very, very much. For example, I guess 99.9% of internet pages in Serbian use
Latin and not Cyrillic. Why? Well, you know that another Serbian will be
able to read your page on any ASCII computer (more or less). Unicode is
still not so present on the typical computer.)
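
A small sketch of why the Latin spelling is less precise. The example words
are my own choice for illustration (konj, "horse", and injekcija,
"injection", which is written with separate n and j in Cyrillic), and the
function naive_to_cyrillic with its tiny letter table is hypothetical, not
any real transliteration library.

```python
# Greedy Latin -> Cyrillic conversion with a deliberately tiny table.
DIGRAPHS = {"lj": "\u0459", "nj": "\u045a"}  # CYRILLIC LJE, CYRILLIC NJE
SINGLES = {
    "a": "\u0430", "c": "\u0446", "e": "\u0435", "i": "\u0438",
    "j": "\u0458", "k": "\u043a", "n": "\u043d", "o": "\u043e",
}

def naive_to_cyrillic(word):
    """Convert left to right, always preferring a digraph when one matches."""
    out = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            out.append(DIGRAPHS[pair])
            i += 2
        else:
            out.append(SINGLES[word[i]])
            i += 1
    return "".join(out)

# Correct: "nj" in konj really is the single letter NJE.
print(naive_to_cyrillic("konj"))       # коњ

# Wrong: in injekcija, n and j are two separate letters.
correct = "\u0438\u043d\u0458\u0435\u043a\u0446\u0438\u0458\u0430"  # инјекција
print(naive_to_cyrillic("injekcija"))  # ињекција
print(naive_to_cyrillic("injekcija") == correct)  # False
```

Going from Cyrillic to Latin never has this ambiguity; going from Latin to
Cyrillic needs a dictionary or a human.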

Finally, why I tried to propose different characters: because I thought the
problem would be "solved" more easily that way. Now it appears that until
browsers, operating systems and god knows what else become compliant with
the /locl/ tag, until the /locl/ tag in fonts comes into use, and until all
text, wherever it appears in software, is identified not only by Unicode
code but also *by language tags*, which are not standardized at all, we will
not be able to have proper italic. :((

As a curiosity which you might not know:

Serbians are not even happy with the notion that Cyrillic A and Latin A are
different characters; in our language they aren't. For us, what
Latin-writing people would call ligatures, e.g. lj and nj, can be considered
real LETTERS which are just the Latin representation of our Cyrillic
characters, which Unicode calls CYRILLIC LJE and CYRILLIC NJE.
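
For reference, a short Python sketch that prints how Unicode names these
letters. Besides the two Cyrillic letters it also lists the single-code-point
Latin digraphs LJ and NJ that Unicode provides; showing them next to each
other is my own addition, and the variable name letters is arbitrary.

```python
import unicodedata

# Cyrillic lje and nje, then the one-character Latin digraphs lj and nj.
letters = ["\u0459", "\u045a", "\u01c9", "\u01cc"]

for ch in letters:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
# U+0459  CYRILLIC SMALL LETTER LJE
# U+045A  CYRILLIC SMALL LETTER NJE
# U+01C9  LATIN SMALL LETTER LJ
# U+01CC  LATIN SMALL LETTER NJ

# Compatibility normalization turns the Latin digraph back into two letters.
print(unicodedata.normalize("NFKC", "\u01c9") == "lj")  # True
```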

So from my perspective Unicode is not "natural" enough. Other people here
may not agree, but I see in it some "godzilla ASCII" where what you see as
"glyph variants" or "typographical issues" have already got their place
(what better example than ligatures like fi, ff, ffi etc., which definitely
are not letters in any language), as far as it has been explained to me,
"for historical reasons".
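
The ligatures mentioned above can be inspected with Python's standard
unicodedata module. This is only a sketch to show that they are encoded as
characters of their own, each with a compatibility mapping back to plain
letters; the variable name ligatures is arbitrary.

```python
import unicodedata

# The fi, ff and ffi ligatures as encoded characters.
ligatures = ["\ufb01", "\ufb00", "\ufb03"]

for ch in ligatures:
    name = unicodedata.name(ch)
    plain = unicodedata.normalize("NFKC", ch)
    print(f"U+{ord(ch):04X}  {name}  ->  {plain}")
# U+FB01  LATIN SMALL LIGATURE FI  ->  fi
# U+FB00  LATIN SMALL LIGATURE FF  ->  ff
# U+FB03  LATIN SMALL LIGATURE FFI  ->  ffi

# The <compat> tag marks the mapping as a compatibility decomposition.
print(unicodedata.decomposition("\ufb01"))  # <compat> 0066 0069
```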

Sorry for the long letter; I hope it will be informative for people who are
interested in the subject.

Janko


