Unicode Technical Note #4

Leaks in the Unicode Pipeline: Script, Script, Script…

Version	1
Authors	Michael Everson
Date	2002-05
This Version	http://www.unicode.org/notes/tn4/tn4-1.html
Previous Version	none
Latest Version	http://www.unicode.org/notes/tn4

Summary

This paper by Michael Everson was presented at the 21st International Unicode Conference in May 2002. It observes that an examination of the Roadmap shows that there are at present no fewer than 92 scripts yet to be encoded. These scripts range from large, complex, and famous dead scripts like Egyptian hieroglyphs to small, little-known but simple scripts like Old Permic. Importantly, about a third of them are living scripts that are intended to go on the BMP. Over the past few years, implementers and standardizers alike have expressed concern about how much work remains to be done. "When will the standard be finished?" they have asked. This Technical Note gives a brief overview of the history of Unicode allocations and discusses the standardization process required for newly allocated scripts, including the kinds of procedural, political, and implementation issues that arise in trying to get a script standardized. The different types of scripts remaining to be encoded are discussed with regard to the ease with which they can be both encoded and implemented. Finally, a proposal for the way forward is given.

This paper is provided for historical reference only. Parts of the strategy and ideas presented herein have been superseded by subsequent events, and no claims are made as to currency or applicability beyond its historical position as a seminal paper concerning issues of character encoding.

Status

This document is a Unicode Technical Note. It is supplied purely for informational purposes and publication does not imply any endorsement by the Unicode Consortium. For general information on Unicode Technical Notes, see http://www.unicode.org/notes.

The body of this note is contained in the file "Leaks in the Unicode Pipeline: Script, Script, Script...".

© 2002 Michael Everson. This publication is protected by copyright, and permission must be obtained from the author and Unicode, Inc. prior to any reproduction, modification, or other use not permitted by the Terms of Use.

Use of this publication is governed by the Unicode Terms of Use. The authors, contributors, and publishers have taken care in the preparation of this publication, but make no express or implied representation or warranty of any kind and assume no responsibility or liability for errors or omissions or for consequential or incidental damages that may arise therefrom. This publication is provided “AS-IS” without charge as a convenience to users.

Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the United States and other countries.
