Re: Language Tagging And Unicode

From: John Cowan (jcowan@reutershealth.com)
Date: Wed Jan 19 2000 - 10:07:08 EST


Janko Stamenovic wrote:

> And people lived exactly as you
> are talking now: special font for Serbian, special font for Russian. One
> person can print it, but give the document to another -- he's in trouble.
> The idea with Unicode is to get rid of such problems, I guess?

No. The idea with Unicode is to make *plain text* interchangeable across
the world's writing systems. The difference between roman and italic
is not visible in plain text. There is no implication that specialized
fonts are not necessary to render specific languages well.

> Now after all this letters I can better understand your arguments also. Am I
> right that the main problem with adding additional characters to Unicode
> would be that if we'd make character table showing the shape of letters we'd
> find that Russian P and Serbian P look the same (since the difference would
> not be visible without cursive print).

Not at all. Many Unicode characters look alike. The trouble is that
Serbian P and Russian P are the same letter, traditionally rendered in
good typography using different "looks", at least in certain font/face
combinations. Unicode has many violations of the "one character, one
codepoint" principle, but they are for backward compatibility with
existing character sets (not existing typographical traditions).
There is no such problem here (i.e. no character set that represents
Russian P and Serbian P with different codes). On the contrary, we
have the DOS and Windows codepages which purport to handle both
Russian and Serbian text, and there must be a single conversion table
for them, not language-specific conversion tables.
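
A rough sketch of the conversion-table point, in Python. The codec names
"cp1251" and "cp855" are Python's standard names for the Windows and DOS
Cyrillic codepages; the two sample words and the helper first_letter are
illustrative assumptions of mine, not anything taken from a real
conversion tool. Each codepage gives the letter one byte whichever
language the word is in, and that byte converts to one Unicode character:

```python
import unicodedata

# Sample words (assumed for illustration), written as escapes to stay ASCII-safe.
RUSSIAN = "\u043f\u0438\u0441\u044c\u043c\u043e"  # Russian "pis'mo"
SERBIAN = "\u043f\u0438\u0441\u043c\u043e"        # Serbian "pismo"


def first_letter(word, codepage):
    """Encode word in a single-byte codepage and convert its first byte back.

    Returns (byte value, Unicode character name of that byte).
    """
    raw = word.encode(codepage)
    char = raw[:1].decode(codepage)
    return raw[0], unicodedata.name(char)


for codepage in ("cp1251", "cp855"):
    # Within one codepage the two tuples are identical: the table has no
    # notion of which language the text is in.
    print(codepage, first_letter(RUSSIAN, codepage), first_letter(SERBIAN, codepage))
```

The byte values differ between the two codepages, but within each one
the Russian and the Serbian word yield the same byte and the same
Unicode character.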

> But the same argument can be said to already existing characters in Unicode:
> Why is Latin A and Cyrillic A different? Do you see the difference in "plain
> text" look? I don't.

No. But they are not unified, because they belong to different scripts.
(There are a few violations of this principle, notably the lack of a proper
Cyrillic Q for Kurdish.)
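
The Latin A / Cyrillic A case is easy to check for yourself. A small
sketch, using Python's standard unicodedata module (the variable names
are mine):

```python
import unicodedata

LATIN_A = "A"          # U+0041
CYRILLIC_A = "\u0410"  # U+0410, same shape in most fonts

for ch in (LATIN_A, CYRILLIC_A):
    # Print the code point and the character name, which includes the script.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# The two characters never compare equal, however alike they look.
print(LATIN_A == CYRILLIC_A)
```

The names come out as LATIN CAPITAL LETTER A and CYRILLIC CAPITAL
LETTER A, and the comparison is False.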

-- 

Schlingt dreifach einen Kreis vom dies! || John Cowan <jcowan@reutershealth.com>
Schliesst euer Aug vor heiliger Schau,  || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,           || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.            -- Coleridge (tr. Politzer)


