RE: Language Tagging And Unicode

From: Janko Stamenovic
Date: Wed Jan 19 2000 - 07:37:39 EST

So if two Chinese "font variants" exist as different characters, what more
can I say?

And people here complain about 5 new characters for the difference between
Serbian and Russian?

An even bigger thing: A, E, O are the same in Cyrillic and Latin, yet they are
different characters in Unicode.
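
To make that point concrete, here is a minimal Python sketch that prints the
code point and Unicode name of each look-alike pair. The helper name
`describe` and the `PAIRS` list are made up for illustration; only the
standard `unicodedata` module is used.

```python
import unicodedata

# Visually identical letter pairs: Latin first, Cyrillic second.
PAIRS = [("A", "\u0410"), ("E", "\u0415"), ("O", "\u041E")]


def describe(ch):
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for latin, cyrillic in PAIRS:
    # The glyphs look alike, but the characters compare as different.
    print(describe(latin), "|", describe(cyrillic), "| equal:", latin == cyrillic)
```

Each pair comes out as two distinct code points with distinct names, which
is why a plain string comparison between them fails.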

If we want to look at them that way, a lot of differences between Latin and
Cyrillic can be seen as "different glyph variants" for the same characters.
In Serbian this is more than true, since there is a 1-1 correspondence between
Latin and Cyrillic text!

Search arguments like "now you can search for the Russian and Serbian
words which are written the same" fail in my language: as things stand, we
cannot search for the word itself, only for the "Latin representation of the
word" and the "Cyrillic representation of the word".
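
As a rough illustration of the search problem, the sketch below folds Serbian
Latin text to Cyrillic before comparing, so that both spellings of a word
match. The names `to_cyrillic` and `same_word` are hypothetical, the table
covers lowercase letters only, and the digraph handling is naive: it assumes
every "lj", "nj" and "dž" is a single letter, which is not true for every
word.

```python
import unicodedata

# Digraphs that correspond to one Cyrillic letter; tried before single letters.
DIGRAPHS = {"lj": "љ", "nj": "њ", "dž": "џ"}

# Single-letter correspondences (lowercase only in this sketch).
SINGLES = {
    "a": "а", "b": "б", "c": "ц", "č": "ч", "ć": "ћ", "d": "д", "đ": "ђ",
    "e": "е", "f": "ф", "g": "г", "h": "х", "i": "и", "j": "ј", "k": "к",
    "l": "л", "m": "м", "n": "н", "o": "о", "p": "п", "r": "р", "s": "с",
    "š": "ш", "t": "т", "u": "у", "v": "в", "z": "з", "ž": "ж",
}


def to_cyrillic(text):
    """Fold Serbian Latin text to lowercase Cyrillic (naive sketch)."""
    # NFC makes sure letters such as "č" are single precomposed characters.
    text = unicodedata.normalize("NFC", text.lower())
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in DIGRAPHS:
            out.append(DIGRAPHS[pair])
            i += 2
        else:
            # Characters outside the table (including Cyrillic) pass through.
            out.append(SINGLES.get(text[i], text[i]))
            i += 1
    return "".join(out)


def same_word(a, b):
    """Compare two words regardless of which script they are written in."""
    return to_cyrillic(a) == to_cyrillic(b)


print("Beograd" == "Београд")           # plain comparison of the two spellings
print(same_word("Beograd", "Београд"))  # comparison after folding
```

A real search system would need a proper treatment of the exceptional
digraph cases and of mixed-script input; this only shows why the two
representations have to be reconciled somewhere outside the character
encoding.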

Again, I don't see why the Serbian t must be the same character as the Russian
t in Unicode, which has already spent a lot of different characters for "the same

Can anybody now explain to me the exact logic for "what is a character and
what's not"? As far as I can see, the only real rule is "whatever people
accept as a character".

So I ask anybody here: who will lose what by accepting these characters? I
don't see any loss whatsoever.

> -----Original Message-----
> From: []
> Sent: Wednesday, January 19, 2000 11:04 AM
> To: Unicode List
> Subject: RE: Language Tagging And Unicode
> Richard Gillam wrote:
> >I have yet to hear a good reason why the Serbian/Russian problem is
> >anything more than a font-selection issue. It's the same problem
> >you have with Greek/Coptic, Arabic/Urdu/Persian, and Traditional
> >Chinese/Simplified Chinese/Japanese.
> I agree: Ol Korekt, apart from one error, which should probably find its
> place in some FAQ:
> Traditional and simplified Chinese are not font variants of each other!
> These two national variants of ideographs (used in Taiwan and the People's
> Republic, respectively) have actually been encoded as different characters.
> This is because of how the "CJK Unification" process has been defined. Maybe
> this is a blunder, maybe it would have been more correct, or fair, or useful
> that they were font variants: I don't know and I don't want to discuss this,
> but please just take notice of the fact.
> _Marco
