Two New Coded Character Standards from China and their Implementation: HK SCS & GB 18030
Dirk Meyer - Adobe Systems, Inc.
The purpose of this presentation is to describe two Chinese "coded character sets" that were published in August 1999 and in March 2000, respectively, as well as problems related to their implementation. These two standards are the Hong Kong Supplementary Character Set (HK SCS) and the Chinese national standard GB 18030-2000.
The paper will provide detailed information about the history and the contents of both standards.
For developers of applications and fonts who intend to support the languages and scripts involved ("simplified" and "traditional" Chinese), the two standards pose specific challenges, especially when the working environment is partly or fully Unicode-based. In such an environment, complete support for the HK SCS, for example, can only be achieved through effective use of the Private Use Area. Fonts and applications that fulfill this task will be introduced and described in the form of a case study (Adobe Acrobat 5.0 and the underlying "substitution fonts").
When it comes to GB 18030-2000, it is of crucial importance that the mapping mechanisms between the original encoding and Unicode are implemented efficiently. Currently open questions about the coverage of GB 18030 (which may very well have been resolved by the time of the presentation) will be mentioned. Again, Acrobat 5.0 and its font machinery will be used to illustrate one possible approach to handling the challenges here.
Although covering different character repertoires and based on different encodings, the new Chinese standards are similar in that they both provide clearly defined conduits to Unicode and in that they both contain unique challenges when it comes to their implementation in a Unicode environment.
With regard to the standard HK SCS, not all of its characters are available in Unicode 3.0. Support for this character set in Unicode-based applications can be achieved only with the help of the Private Use Area.
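To make the Private Use Area dependency concrete, the following is a minimal sketch of how an application might detect text that relies on Private Use code points and therefore cannot be interpreted without an agreed-upon mapping or a matching font. The function names `is_private_use` and `private_use_chars` are hypothetical and chosen for illustration; the check relies on the Unicode general category "Co" (Private Use) as reported by Python's standard `unicodedata` module. It does not reproduce the HK SCS mapping itself or the approach taken in Acrobat.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """Return True if the single character ch is a Private Use code point.

    General category "Co" covers the BMP Private Use Area
    (U+E000..U+F8FF) as well as the private use planes 15 and 16.
    """
    return unicodedata.category(ch) == "Co"


def private_use_chars(text: str) -> list[str]:
    """Collect the Private Use characters in text, in order of appearance."""
    return [ch for ch in text if is_private_use(ch)]


if __name__ == "__main__":
    # Hypothetical sample: one ASCII letter, one Private Use code point,
    # and one ordinary CJK ideograph.
    sample = "A\ue000\u4e2d"
    for ch in private_use_chars(sample):
        print(f"U+{ord(ch):04X} needs a private mapping")
```

A check like this only flags which characters depend on a private agreement; the meaning of each flagged code point still has to come from the mapping tables that accompany the character set.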
GB 18030-2000, on the other hand, presents a challenge in that it applies a four-byte encoding scheme to map the complete character repertoire of Unicode 3.0 into its own encoding space in order to remain compatible with pre-existing national Chinese standards, namely the specification GBK.
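The multi-byte structure can be illustrated with a short sketch. It assumes Python's built-in `gb18030` codec, whose mapping tables may reflect a different revision of the standard than the one discussed here. The helper names `gb18030_length` and `four_byte_linear_index` are hypothetical. The linear index follows from the byte ranges of the four-byte form (first and third bytes 0x81 to 0xFE, second and fourth bytes 0x30 to 0x39).

```python
def gb18030_length(ch: str) -> int:
    """Number of bytes the codec uses for one character (1, 2, or 4)."""
    return len(ch.encode("gb18030"))


def four_byte_linear_index(seq: bytes) -> int:
    """Position of a four-byte sequence, counted from 0x81308130.

    Bytes 1 and 3 each take 126 values (0x81..0xFE);
    bytes 2 and 4 each take 10 values (0x30..0x39).
    """
    if len(seq) != 4:
        raise ValueError("expected exactly four bytes")
    b1, b2, b3, b4 = seq
    if not (0x81 <= b1 <= 0xFE and 0x30 <= b2 <= 0x39
            and 0x81 <= b3 <= 0xFE and 0x30 <= b4 <= 0x39):
        raise ValueError("not a well-formed four-byte sequence")
    return (((b1 - 0x81) * 10 + (b2 - 0x30)) * 126 + (b3 - 0x81)) * 10 + (b4 - 0x30)


if __name__ == "__main__":
    # ASCII stays one byte, a GBK-range ideograph stays two bytes,
    # and U+0080 falls into the four-byte area.
    for ch in ("A", "\u4e2d", "\u0080"):
        encoded = ch.encode("gb18030")
        print(f"U+{ord(ch):04X} -> {encoded.hex()} ({len(encoded)} bytes)")
```

Keeping the one- and two-byte forms unchanged is what preserves compatibility with GBK; only characters outside that repertoire take the four-byte form.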
While these relatively new standards show an obvious tendency to support Unicode as much as possible, it remains the task of applications or fonts to integrate the standards' character repertoires seamlessly into a Unicode-based environment.
13 December 2000, Webmaster