Unicode transliterations (and other operations)

From: Mark Davis (mark@macchiato.com)
Date: Mon Jul 02 2001 - 17:56:16 EDT


For those interested in transliteration (and other Unicode transformations), there is a new ICU web demo program at

http://oss.software.ibm.com/developerworks/opensource/icu/translitdemo

The demo includes basic script transliterations, such as

"Горбачев, Михаил" => "Gorbachèv, Mìkhaìl"

but also other operations. For example, you can feed in "국삼" and get out:

"{HANGUL SYLLABLE GUG}{HANGUL SYLLABLE SAM}" (with the Any-Name transliterator), or

"\uAD6D\uC0BC" (with the Unicode-Hex transliterator).

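As a rough illustration of what those two transformations do (this is not the ICU implementation or its API), the same outputs can be approximated in Python with the standard unicodedata module. The function names any_name and unicode_hex are made up for this sketch, and the hex form assumes every character is in the Basic Multilingual Plane.

```python
import unicodedata


def any_name(text):
    """Replace each character with its Unicode character name in braces.

    Hypothetical helper, not an ICU call. unicodedata.name() raises
    ValueError for characters that have no name (e.g. control codes).
    """
    return "".join("{" + unicodedata.name(ch) + "}" for ch in text)


def unicode_hex(text):
    """Replace each character with a backslash-u escape of its code point.

    Hypothetical helper, not an ICU call. Assumes BMP characters only,
    so four hex digits are always enough.
    """
    return "".join("\\u{:04X}".format(ord(ch)) for ch in text)


if __name__ == "__main__":
    sample = "\uAD6D\uC0BC"  # the two Hangul syllables from the example
    print(any_name(sample))
    print(unicode_hex(sample))
```

Run on the two syllables above, this prints {HANGUL SYLLABLE GUG}{HANGUL SYLLABLE SAM} and \uAD6D\uC0BC, matching the demo output quoted earlier.
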
Transliterators can be chained together, and can also be created at runtime with the demo, e.g. the simple rules:

j > i;
J > I;
u > v;
U > V;
w > vv;
W > VV;

when applied to "The quick brown fox jumped over the lazy dog.", produce "The qvick brovvn fox ivmped over the lazy dog."
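
To make the rule example concrete, here is a minimal sketch in Python of how rules of this simple "source > target;" shape could be parsed and applied. It is an assumption-laden toy, not ICU's rule engine: it handles only literal strings, tries rules in the order given at each position, and does not rescan its own output. The names parse_rules and apply_rules are invented for the illustration.

```python
RULES = """
j > i;
J > I;
u > v;
U > V;
w > vv;
W > VV;
"""


def parse_rules(source):
    """Parse 'left > right;' statements into a list of (left, right) pairs.

    Toy parser (hypothetical, not ICU syntax handling): literal text only,
    no contexts, variables, quoting, or escapes.
    """
    rules = []
    for statement in source.split(";"):
        statement = statement.strip()
        if not statement:
            continue
        left, right = (part.strip() for part in statement.split(">", 1))
        if not left:
            raise ValueError("rule has an empty left-hand side")
        rules.append((left, right))
    return rules


def apply_rules(rules, text):
    """Scan text left to right; at each position the first matching rule wins."""
    out = []
    i = 0
    while i < len(text):
        for left, right in rules:
            if text.startswith(left, i):
                out.append(right)   # emit the replacement
                i += len(left)      # skip past the matched source text
                break
        else:
            out.append(text[i])     # no rule matched: copy the character
            i += 1
    return "".join(out)


if __name__ == "__main__":
    rules = parse_rules(RULES)
    print(apply_rules(rules, "The quick brown fox jumped over the lazy dog."))
```

With the six rules above this prints "The qvick brovvn fox ivmped over the lazy dog.", the same result as quoted from the demo.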

Transliterations will be enhanced significantly for the ICU 2.0 release, so this is a work in progress. Comments and feedback are welcome.

Mark


