UnicodeIUC18
Unicode Standard Conference Board Past Conferences Call for Papers Sponsors Showcase
Registration Accommodation Travel Program Talks and Papers Next Conference
Abstract

The Hitchhiker's Guide to Chinese Encodings

Thomas Emerson - Basis Technology Corporation

Intended Audience: Software Engineer
Session Level: Intermediate

This paper presents an overview and analysis of the plethora of Chinese character encodings, describing their similarities and differences and how they map to various versions of Unicode. For example, how does Big 5 compare with Big 5+ and Microsoft CP950? What about the various extensions to Big 5? How do the HKSCS, Eten, and HKUST EUDC extensions to Big 5 compare, and how do they map to Unicode? How does one round-trip each of these? And then there is CNS-11643...

Unfortunately, dealing with Simplified Chinese is no simpler: what is the relationship between GB 2312:80 and GB 12345:90 (GB 12345 is the traditional analog to GB 2312), and how does GB 12345 compare with Big 5? For that matter, how does GB 2312:80 compare with GBK and Microsoft CP936? And how do all of these map to Unicode 2.1 and 3.0.1? What do all these mean for the poor programmer who has to try to deal with them?
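As a rough illustration of the round-trip question, the following minimal Python sketch checks whether a string survives encoding and decoding in several of the encodings named above. It is not taken from the paper. The helper names round_trips and coverage are hypothetical, and the sketch relies on the codecs bundled with Python (big5, cp950, big5hkscs, gb2312, gbk, cp936), which reflect one implementation's mapping tables rather than authoritative definitions of each standard. Extensions such as Eten, HKUST EUDC, and Big 5+, as well as GB 12345 and CNS-11643, are outside what this sketch covers.

```python
def round_trips(text: str, encoding: str) -> bool:
    """Return True if text is unchanged after encode -> decode in the codec.

    Hypothetical helper for illustration; it tests only one implementation's
    mapping table (Python's bundled codec), not the standard itself.
    """
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeError:
        # The codec has no mapping for at least one character.
        return False


def coverage(text: str, encodings: list[str]) -> dict[str, bool]:
    """Map each codec name to whether text round-trips through it."""
    return {enc: round_trips(text, enc) for enc in encodings}


if __name__ == "__main__":
    codecs_to_try = ["big5", "cp950", "big5hkscs", "gb2312", "gbk", "cp936"]
    # The first sample uses characters common to both scripts;
    # the second is a Traditional-only form.
    for sample in ("中文", "體"):
        print(sample, coverage(sample, codecs_to_try))
```

A character-level check like this only shows whether a mapping exists in both directions. It does not reveal whether two encodings map the same byte sequence to different Unicode code points, which requires comparing the mapping tables themselves.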

At the end of this presentation, you will leave with a better understanding of how these encodings relate and how to deal with them when authoring Chinese-language applications.



International Unicode Conferences are organized by Global Meeting Services, Inc., (GMS). GMS is pleased to be able to offer the International Unicode Conferences under an exclusive license granted by the Unicode Consortium. All responsibility for conference finances and operations is borne by GMS. The independent conference board serves solely at the pleasure of GMS and is composed of volunteers active in Unicode and in international software development. All inquiries regarding International Unicode Conferences should be addressed to info@global-conference.com.

Unicode and the Unicode logo are registered trademarks of Unicode, Inc. Used with permission.

10 December 2000, Webmaster