Twentieth International Unicode Conference

Unicode Forms: Too Many or Not Enough?

Sandra Martin O'Donnell - Compaq Computer Corporation

Intended Audience:	Software Engineers, Content Developers, Font Designers
Session Level:	Beginner, Intermediate

Although Unicode began as a single character set with a single 16-bit encoding form, the past 10 years have seen the addition of multiple UTFs, and every year brings requests for more forms, tweaks, and variations. The challenge is to balance specific needs against the overall complexity that any change adds to the standard. This paper discusses existing and requested forms, along with their impact. It covers:

forms that are now included in the standard (UTF-8, UTF-16, and UTF-32) and their primary uses throughout the computing world,
forms or attributes that are being requested (new UTFs, Variant Selector, character clones, etc.), along with their pros and cons, and
an assessment of the interoperability issues and overall complexity associated with the forms, including recommended actions.

When the world wants to talk, it speaks Unicode

International Unicode Conferences are organized by Global Meeting Services, Inc. (GMS). GMS is pleased to offer the International Unicode Conferences under an exclusive license granted by the Unicode Consortium. All responsibility for conference finances and operations is borne by GMS. The independent conference board serves solely at the pleasure of GMS and is composed of volunteers active in Unicode and in international software development. All inquiries regarding International Unicode Conferences should be addressed to info@global-conference.com.

Unicode and the Unicode logo are registered trademarks of Unicode, Inc. Used with permission.

9 November 2001, Webmaster