UnicodeIUC19
Abstract

Optimal Unicode 3.x Character Attributes and Access Methods

Ienup Sung - Sun Microsystems, Inc.

Intended Audience: Manager, Software Engineer, Systems Analyst
Session Level: Intermediate

Unicode Version 3.1 defines and encodes a total of 94,140 characters, which is about 8.45% of the maximum number of characters that Unicode can represent. These characters are spread over a wide span of the Unicode coding space and are concentrated in Planes 0, 1, 2, and 14.
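The 8.45% figure can be reproduced with a short calculation. The sketch below assumes that the "possible maximum" means the full code space of 17 planes (U+0000 through U+10FFFF), counting surrogate and noncharacter code points; the abstract does not state the denominator, so that reading is an assumption.

```python
# Assumption: the denominator is the whole Unicode code space,
# 17 planes of 65,536 code points each (U+0000..U+10FFFF).
PLANES = 17
CODE_POINTS_PER_PLANE = 0x10000
TOTAL_CODE_POINTS = PLANES * CODE_POINTS_PER_PLANE  # 1,114,112

# Character count for Unicode 3.1 as quoted in the abstract.
ENCODED_IN_3_1 = 94_140


def coverage_percent(encoded, total=TOTAL_CODE_POINTS):
    """Return the share of the code space that is encoded, in percent."""
    return 100.0 * encoded / total


print(f"{coverage_percent(ENCODED_IN_3_1):.2f}%")  # prints 8.45%
```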

Every character defined in Unicode has various character attribute values, such as character classes and collation weights. Application programs use these attribute values frequently. For this reason, most underlying platform software provides such character attribute values to upper layers of software through a set of programming interfaces as a supported feature.
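As an illustration of what such a programming interface looks like to an application, the sketch below uses Python's standard `unicodedata` module. This is only one example of an attribute interface and is not the platform interface discussed in the talk; the helper name `describe` is hypothetical.

```python
import unicodedata


def describe(ch):
    """Collect a few character attributes for a single character."""
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),    # general category, e.g. "Lu"
        "combining": unicodedata.combining(ch),  # canonical combining class
    }


for ch in ("A", "7", "\u0301"):
    print(f"U+{ord(ch):04X}", describe(ch))
```

An application calls the interface once per character, often inside tight loops, which is why the cost of each lookup matters.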

Even though computer hardware and software system resources are becoming cheaper every day, platform software must still provide this functionality to upper layers of software in a way that uses minimal system resources and yet is fast enough for its users to achieve the best possible performance.

The goal of this technical presentation is to present several generic and Unicode 3.x-specific data structures and access methods, and to provide comprehensive theoretical and empirical analyses and comparisons among them. The theoretical study provides best-, average-, and worst-case system resource consumption and execution time. The empirical study is based on actual measurements of system resource consumption and execution time for various input data on several common hardware configurations for both client and server workstations.
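The abstract does not name the data structures that are compared. As a rough sketch of the kind of space and time trade-off involved, the example below contrasts two generic approaches: a sorted range table searched by binary search, and a two-stage lookup table. Both the choice of structures and the attribute data are assumptions made for illustration; all names are hypothetical.

```python
from bisect import bisect_right

# Hypothetical attribute data: sorted, non-overlapping (first, last, value)
# ranges. The values are illustrative, not taken from the talk.
RANGES = [
    (0x0030, 0x0039, "digit"),
    (0x0041, 0x005A, "upper"),
    (0x0061, 0x007A, "lower"),
    (0x4E00, 0x9FA5, "ideograph"),
    (0x20000, 0x2A6D6, "ideograph"),
]
DEFAULT = "none"
STARTS = [first for first, _, _ in RANGES]


def lookup_ranges(cp):
    """Binary search over the ranges: O(log n) time, compact storage."""
    i = bisect_right(STARTS, cp) - 1
    if i >= 0 and cp <= RANGES[i][1]:
        return RANGES[i][2]
    return DEFAULT


BLOCK_BITS = 8
BLOCK_SIZE = 1 << BLOCK_BITS


def build_two_stage(max_cp=0x10FFFF):
    """Build a two-stage table in which identical blocks are stored once."""
    stage1, stage2, seen = [], [], {}
    for base in range(0, max_cp + 1, BLOCK_SIZE):
        block = tuple(lookup_ranges(cp) for cp in range(base, base + BLOCK_SIZE))
        if block not in seen:
            seen[block] = len(stage2)
            stage2.append(block)
        stage1.append(seen[block])
    return stage1, stage2


STAGE1, STAGE2 = build_two_stage()


def lookup_two_stage(cp):
    """Two array indexing steps: constant time, larger tables."""
    return STAGE2[STAGE1[cp >> BLOCK_BITS]][cp & (BLOCK_SIZE - 1)]


print(lookup_ranges(0x4E2D), lookup_two_stage(0x4E2D))
```

The range table stays small when attributes come in long runs but needs a search per lookup. The two-stage table answers in constant time and relies on block sharing to keep the sparse parts of the code space from costing memory.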


International Unicode Conferences are organized by Global Meeting Services, Inc. (GMS). GMS is pleased to be able to offer the International Unicode Conferences under an exclusive license granted by the Unicode Consortium. All responsibility for conference finances and operations is borne by GMS. The independent conference board serves solely at the pleasure of GMS and is composed of volunteers active in Unicode and in international software development. All inquiries regarding International Unicode Conferences should be addressed to info@global-conference.com.

Unicode and the Unicode logo are registered trademarks of Unicode, Inc. Used with permission.

22 Jun 2001, Webmaster