Re: Language Tagging And Unicode

From: Robert A. Rosenberg
Date: Wed Jan 19 2000 - 14:43:49 EST

At 01:50 PM 01/18/2000 -0800, John Cowan wrote:
>"Robert A. Rosenberg" wrote:
> > I agree. You assign separate codepoints to the 5 characters.
>And what about automatic conversion from existing Cyrillic
>character sets? Gone, all gone!

I assume that you are talking about files that are in Serbian as opposed to
some other Cyrillic language. If you know that a file is Serbian, just use a
translation/conversion table that maps the 5 characters in question (in both
UC and LC) to the Serbian codepoints in lieu of the Russian/etc. Unicode
codepoints. Since you say "automatic", there must already be tables, so using
one as opposed to the other should not be that hard. If you accidentally goof
and use the wrong one, just do another translate that maps each character to
itself except for those 10 codepoints.
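
To make the two-table idea concrete, here is a rough Python sketch. Everything
specific in it is an assumption made for illustration: the five letters are
placeholders (this message does not name them), the "Serbian" codepoints are
stand-ins taken from the Private Use Area because no such codepoints are
assigned, and ISO-8859-5 is just one example of an existing Cyrillic character
set.

```python
# Placeholder letters: this message does not say which five are meant.
LOWER = "бгдпт"
UPPER = LOWER.upper()

# Hypothetical targets: Private Use Area stand-ins for the proposed
# separate Serbian codepoints (none are actually assigned).
PUA_BASE = 0xE000

# Shared Cyrillic codepoint -> hypothetical Serbian codepoint (10 entries).
TO_SERBIAN = {ord(ch): PUA_BASE + i for i, ch in enumerate(UPPER + LOWER)}

# Inverse table, for undoing a conversion done with the wrong table.
TO_SHARED = {serbian: shared for shared, serbian in TO_SERBIAN.items()}


def decode_serbian(raw: bytes, charset: str = "iso8859_5") -> str:
    """Decode with the ordinary table, then remap the 10 codepoints."""
    return raw.decode(charset).translate(TO_SERBIAN)


def repair(text: str) -> str:
    """Map every character to itself except the 10 remapped codepoints."""
    return text.translate(TO_SHARED)
```

str.translate leaves any character missing from the table untouched, which is
the "maps each character to itself except for the 10 codepoints" step. A file
converted with the Serbian table by mistake can be passed through repair()
without going back to the original bytes.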

>Schlingt dreifach einen Kreis vom dies! || John Cowan
>Schliesst euer Aug vor heiliger Schau, ||
>Denn er genoss vom Honig-Tau, ||
>Und trank die Milch vom Paradies. -- Coleridge (tr. Politzer)
