Unicode in Historical Perspective
Carter Weiss - Hokulewa Associates
While Unicode is a major technological turning point in text handling for computing and data communications systems, it is also part of a rich historical tradition of representing text numerically. This paper reviews that history, highlighting the problems Unicode solves and the issues that remain.
The paper begins with a brief survey of earlier text encoding technologies like Morse code, punched cards, and teletype.
We then turn to the emergence of code page technology for computer systems in the 1950s. The major families of computer code pages are introduced, along with the standards and architectures used to design and classify them.
Along with the "Babel of code pages," legacy code page technology introduces two problems that are particularly difficult for implementers of database systems: (a) the multiple code page problem and (b) the mixed script problem. The standard (generic) solution to these problems in the relational database management system (RDBMS) industry is described and illustrated with actual applications.
Unicode, then, is the natural culmination of this history and the solution to these earlier difficulties. We discuss the origins of Unicode, the reasons behind its key design principles, and its emergence as the premier text technology for current and emerging computer and data communications technologies.
The paper concludes with a few words on Unicode's role in the World Wide Web and on emerging issues, including Unicode's role in network security.
When the world wants to talk, it speaks Unicode
International Unicode Conferences are organized by Global Meeting Services, Inc., (GMS).
GMS is pleased to be able to offer the International Unicode Conferences under an exclusive
license granted by the Unicode Consortium. All responsibility for conference finances and
operations is borne by GMS. The independent conference board serves solely at the pleasure
of GMS and is composed of volunteers active in Unicode and in international software
development. All inquiries regarding International Unicode Conferences should be addressed
Unicode and the Unicode logo are registered trademarks of Unicode, Inc. Used with permission.
22 May 2002, Webmaster