RE: Unicode Cyrillic GHE DE PE TE in Serbian

Date: Tue Jan 18 2000 - 10:43:54 EST

Janko did a really great job pointing out this problem about Cyrillic

But his proposed solution is not as brilliant: the idea of adding these extra
characters is a very poor design. If it were accepted (and it will not be,
methinks) it would create problems much bigger than the one it tries to fix.

>As far as I can see it can definitely be one very easy solution: once they
>would appear in Unicode, every software which would support Cyrillic would
>handle them properly.

Having two different encodings for the *same* letters is not at all an easy
solution. It is the most complicated thing I could think of!

Imagine searching, for instance: all applications would have to be changed to
know that "Russian pe" and "Serbian pe" are to be considered the same,
otherwise Serbians would always have problems when searching Russian
documents, and vice versa.
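
A minimal sketch of the search problem, in Python. There is no separate
"Serbian pe" in Unicode, so a Private Use Area code point (U+E000) stands in
for it here; that code point and the function names are assumptions made
purely for illustration.

```python
RUSSIAN_PE = "\u043f"  # CYRILLIC SMALL LETTER PE (real character)
SERBIAN_PE = "\ue000"  # hypothetical duplicate "Serbian pe" (PUA stand-in)

# Folding table that every search routine would have to apply.
FOLD = str.maketrans({SERBIAN_PE: RUSSIAN_PE})


def naive_find(text, word):
    """Plain code-point comparison, as unmodified software would do it."""
    return word in text


def folded_find(text, word):
    """Search after mapping the duplicate letter onto the original one."""
    return word.translate(FOLD) in text.translate(FOLD)


russian_doc = "привет"  # typed with the ordinary pe
serbian_query = russian_doc.replace(RUSSIAN_PE, SERBIAN_PE)  # same word, other pe

print(naive_find(russian_doc, serbian_query))   # False
print(folded_find(russian_doc, serbian_query))  # True
```

Only the second function finds the word, and it works only because somebody
remembered to add the folding table.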

Imagine case-conversions: Cyrillic uppercase PE would correspond to two
separate lowercase letters. When you uppercase text there is no problem; but
when you *lowercase* it, the software needs to know which lowercase to use.
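
The lowercasing ambiguity can be sketched the same way. Again, U+E000 is only
a hypothetical stand-in for the proposed "Serbian pe", and the table and
function names are invented for this example.

```python
CAPITAL_PE = "\u041f"  # CYRILLIC CAPITAL LETTER PE (real character)
RUSSIAN_PE = "\u043f"  # CYRILLIC SMALL LETTER PE (real character)
SERBIAN_PE = "\ue000"  # hypothetical duplicate "Serbian pe" (PUA stand-in)

# Uppercasing is unambiguous: both lowercase letters map to the same capital.
TO_UPPER = {RUSSIAN_PE: CAPITAL_PE, SERBIAN_PE: CAPITAL_PE}


def lowercase_candidates(ch):
    """Return every lowercase letter whose uppercase form is ch."""
    return sorted(low for low, up in TO_UPPER.items() if up == ch)


def to_lower(ch, language):
    """Lowercase one character; the language must be supplied by the caller."""
    per_language = {"ru": RUSSIAN_PE, "sr": SERBIAN_PE}
    if ch == CAPITAL_PE:
        return per_language[language]
    return ch.lower()


print(len(lowercase_candidates(CAPITAL_PE)))  # 2
```

The text itself does not say which of the two candidates is wanted, so
`to_lower` cannot work without the extra `language` argument.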

Let's also consider display, which is Janko's main concern. With the current
solution, we have a problem: if Serbian text is displayed with a font
designed for Russian, some *italic* letters look very strange (and are
possibly unreadable for those who don't have familiarity with Russian). With
Janko's solution, we have a much bigger problem: if Serbian text is
displayed with a font designed for Russian, some letters (italic or not)
will simply display as black boxes.
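
A rough sketch of the "black boxes" case. The font is modelled as nothing
more than the set of code points it has glyphs for; the coverage range, the
U+E000 stand-in for a separate "Serbian pe", and the `render` function are all
assumptions for illustration, not a description of any real font engine.

```python
SERBIAN_PE = "\ue000"  # hypothetical duplicate "Serbian pe" (PUA stand-in)
MISSING_GLYPH = "\u25a0"  # BLACK SQUARE, standing in for the "black box"

# A font designed for Russian: basic Cyrillic letters U+0410..U+044F only.
RUSSIAN_FONT_COVERAGE = set(range(0x0410, 0x0450))


def render(text, coverage):
    """Replace every character the font has no glyph for with a box."""
    return "".join(
        ch if ord(ch) in coverage or ch.isspace() else MISSING_GLYPH
        for ch in text
    )


current_encoding = "песма"                 # Serbian word, ordinary pe
proposed_encoding = SERBIAN_PE + "есма"    # same word, separate "Serbian pe"

print(render(current_encoding, RUSSIAN_FONT_COVERAGE))   # песма
print(render(proposed_encoding, RUSSIAN_FONT_COVERAGE))  # ■есма
```

With the current encoding the word comes out whole (perhaps with an
unfamiliar italic shape); with the separate letter it loses its first
character altogether.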

>... the number of living Cyrillic languages is quite small.

Wrong: Cyrillic is one of the 3 "jolly" alphabets (together with Latin
and Arabic) and it is used to write *hundreds* of languages in the former
USSR and elsewhere. Let's not be Eurocentric considering only superstar
languages like Russian, Serbocroatian, Bulgarian, etc.: there are a great
number of "minority" languages in Asia that are written in Cyrillic too.

>True, but still this is very straightforward solution for the named
>since it does not require any new software concepts like "rendering based
>language tagging" which requires a lot of changes in many levels of the
>software: since I'm Windows programmer, I know that such a request would
>mean that API would have to be extended to pass the *language* information
>to the font creation engine. This means that Microsoft would have to change
>the API (very much!), MFC and all their applications etc. and all this just
>to make possible to display five Serbian letters properly? I don't expect
>this even in next ten years!

I do not expect this in the next 1000 years. Language tagging or other
similar complicated things will probably always be mostly for fine word

For many other easier things, it is enough to use one of the two simpler
solutions that have already been mentioned:

1) Use language-specific fonts, tailored for Serbian *xor* Russian;

2) Use compromise fonts for Serbian *and* Russian (using "sloped" or no

>Now somebody would say that they'd do this because of Japanese or Chinese
>market -- but I don't think so --
>as far as I can see there are different cultural preferences to the look
>of the same characters in Japan and in China, but I don't expect that
>either Chinese or Japanese people will insist that software becomes
>changed so that BOTH variants are visible in one document.

Why not!? The Japanese/Chinese case is exactly the same as the
Serbian/Russian case.

Japanese people want to use Chinese in their documents (and vice versa) just
like Serbians want to use Russian in theirs (and vice versa). Everybody in
the world may need to use foreign languages, especially neighboring

Chinese displayed with Japanese-specific fonts (or vice versa) looks
"questionable" or even "unreadable" just as it happens for Cyrillic italics.

>And even so, information "China or Japan" can be squeezed in "charset"
>field -- but there is not space to squeeze "Serbian" or "Russian"
>in it.
>Contrary to that, having Serbian and Russian text in the same document is
>quite a small goal which should be handled gracefully.

Janko, I don't follow your reasoning here. I thought we were talking about
one single charset: Unicode. And I still don't see the difference between
Chinese/Japanese and Serbian/Russian

>Can you give the example which applications "do not require high-quality
>Cyrillic typography" at the days when nobody buys for printing anything
>is not capable of at least 600 dpi? We are not talking about character
>terminals any more.

Well, do you mean that a 16-pixel liquid-crystal display mounted on a
refrigerator (or on a cellular phone) requires high-quality typography?

But it requires internationalization anyway, because people in Japan, Serbia
or India have every right to use appliances that "talk" in their languages.

>What we are interested in are "common" applications like poor Microsoft

If MS Word is poor, what is MS Notepad? And what is the refrigerator display

>It is far from "professional" typesetting engine, but even such
>should offer something decent. And anybody from the people who
>here "do fonts for living" would tell you that using "sloped" letterforms
>for Times or any Serif is more than unacceptable.

But I would like one of them to admit that "sloped" letterforms for
Helvetica or most sans-serif fonts *are* more than acceptable.

What is not acceptable, IMHO, are the inconsistent choices that you can see,
e.g., in MS Arial (not Arial Unicode: the older one I mean). The italics for
Cyrillic letters that look like Latin letters (e.g. a) are simply "sloped",
but the italics for letters that do *not* look like Latin letters (e.g. pe)
mimic the Russian italics that normally belong to serif fonts.

