UnicodeIUC19
Unicode Standard Conference Board Past Conferences Call for Papers Sponsors Showcase
Registration Accommodation Travel Program Talks and Papers Next Conference
Abstract

An ICU OpenSource Library Supporting the Display of Complex Scripts

Eric Mader - IBM's Globalization Center of Competency

Intended Audience: Software Engineer, Systems Analyst
Session Level: Intermediate

Some scripts require rendering behavior that is more complicated than that of the Latin script. We refer to these scripts as "complex scripts" and to text written in them as "complex text." Examples of complex scripts are the Indic scripts (e.g. Devanagari, Tamil, Telugu and Gujarati), Thai and Arabic.

The ICU LayoutEngine is an OpenSource library that provides a uniform, easy-to-use interface for preparing complex text for display. The LayoutEngine code is independent of the underlying platform's font and rendering architecture: all access to the underlying platform goes through an abstract base class, and a concrete subclass of this base class must be implemented for each platform.
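
The abstract-base-class arrangement described above can be sketched roughly as follows. This is only an illustration of the pattern, written in Python for brevity: the class names `FontInstance` and `InMemoryFont`, the function `measure_text`, and all method names are hypothetical and are not the library's actual interface.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class FontInstance(ABC):
    """Hypothetical platform-independent view of a font (illustrative names)."""

    @abstractmethod
    def get_table(self, tag: str) -> Optional[bytes]:
        """Return the raw font table with this tag, or None if it is absent."""

    @abstractmethod
    def map_char_to_glyph(self, code_point: int) -> int:
        """Return the glyph ID for a code point (0 stands for 'missing glyph')."""

    @abstractmethod
    def get_glyph_advance(self, glyph_id: int) -> float:
        """Return the horizontal advance of a glyph."""


class InMemoryFont(FontInstance):
    """Toy 'platform' backed by dictionaries; a real one would wrap the OS font API."""

    def __init__(self, cmap: Dict[int, int], advances: Dict[int, float],
                 tables: Dict[str, bytes]) -> None:
        self._cmap = cmap
        self._advances = advances
        self._tables = tables

    def get_table(self, tag: str) -> Optional[bytes]:
        return self._tables.get(tag)

    def map_char_to_glyph(self, code_point: int) -> int:
        return self._cmap.get(code_point, 0)

    def get_glyph_advance(self, glyph_id: int) -> float:
        return self._advances.get(glyph_id, 0.0)


def measure_text(font: FontInstance, text: str) -> float:
    """Platform-independent code: it only ever calls the abstract interface."""
    return sum(font.get_glyph_advance(font.map_char_to_glyph(ord(ch)))
               for ch in text)
```

Because `measure_text` depends only on `FontInstance`, porting it to another platform would mean writing one more subclass and leaving the layout code untouched.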

Complex text has four main complications:

  • contextual forms - some characters are displayed using different shapes depending on the surrounding characters
  • ligatures - some sequences of characters are displayed using a special form called a ligature
  • reordering - the order in which the characters are displayed may be different from the spoken or "logical" order
  • positioning - some characters may require horizontal and/or vertical positioning adjustments to display correctly, such as accent marks placed above base characters
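
As a small illustration of the reordering complication: in Devanagari, the vowel sign I (U+093F) follows its consonant in logical order but is drawn to the consonant's left. The sketch below is deliberately simplified and makes one loud assumption: it moves the sign before only the single preceding character, whereas real Indic reordering works on whole syllables, including consonant clusters. The function name `to_visual_order` is hypothetical.

```python
VOWEL_SIGN_I = "\u093F"  # DEVANAGARI VOWEL SIGN I


def to_visual_order(logical: str) -> str:
    """Move each vowel sign I in front of the character that precedes it.

    Simplification (assumption): only the immediately preceding character
    is considered, not the full syllable.
    """
    visual = []
    for ch in logical:
        if ch == VOWEL_SIGN_I and visual:
            # Insert the sign before the last character already emitted.
            visual.insert(len(visual) - 1, ch)
        else:
            visual.append(ch)
    return "".join(visual)
```

For the logical sequence KA (U+0915) followed by vowel sign I, the function returns the sign first and KA second, which is the order in which the glyphs appear on the line.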

Another complication is that, in most cases, the contextual and ligature forms of characters have not been assigned Unicode code points, so they cannot be selected for display by code point alone.
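
The point about code points can be checked against the Unicode Character Database using Python's standard `unicodedata` module. The sketch below contrasts two cases: Arabic BEH, whose contextual forms happen to have compatibility code points, and the Devanagari conjunct KA + VIRAMA + SSA, which has no single code point of its own. The helper name `presentation_forms` is hypothetical.

```python
import unicodedata


def presentation_forms(base: str) -> dict:
    """Collect the Arabic Presentation Forms-B characters that decompose to `base`.

    Returns a mapping such as {"initial": <char>, ...}, keyed by the
    compatibility tag found in the character's decomposition.
    """
    wanted = [f"{ord(base):04X}"]
    forms = {}
    for cp in range(0xFE70, 0xFF00):  # Arabic Presentation Forms-B block
        decomposition = unicodedata.decomposition(chr(cp))
        if not decomposition:
            continue  # unassigned, or no decomposition
        tag, *targets = decomposition.split()
        if targets == wanted:
            forms[tag.strip("<>")] = chr(cp)
    return forms


# Arabic BEH (U+0628): each contextual form has its own code point.
beh_forms = presentation_forms("\u0628")

# Devanagari KA + VIRAMA + SSA: normalization does not compose it into
# one code point, so the conjunct glyph exists only inside the font.
conjunct = "\u0915\u094D\u0937"
composed = unicodedata.normalize("NFC", conjunct)
```

Here `beh_forms` holds four entries (isolated, final, initial and medial), while `composed` is still three code points long.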

The ICU LayoutEngine handles these complications in one of four ways:

  • if the font contains OpenType tables, it will use those
  • if the font contains Apple Advanced Typography tables, it will use those
  • for Arabic and Hebrew text, it will use the Unicode presentation forms, if they are present in the font
  • since Thai text has no defined OpenType behavior, it is handled using special-purpose rules that can handle either the Microsoft or Apple Thai encoding
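
The four cases above amount to a choice of strategy per font and script, which might be sketched as below. Several things here are assumptions rather than statements about the library: the function and enum names are hypothetical, the presence of a `GSUB` table is used as the signal for OpenType layout data and `mort` or `morx` as the signal for Apple Advanced Typography data, the order of the checks is a guess, and the `DEFAULT` fallback is added only so the function always returns something.

```python
from enum import Enum
from typing import Iterable


class Strategy(Enum):
    OPENTYPE = "OpenType tables"
    AAT = "Apple Advanced Typography tables"
    PRESENTATION_FORMS = "Unicode presentation forms"
    THAI_RULES = "special-purpose Thai rules"
    DEFAULT = "no complex processing"  # assumption: not listed in the abstract


def choose_strategy(script: str, font_tables: Iterable[str],
                    has_presentation_forms: bool = False) -> Strategy:
    """Pick a layout strategy from the script and the tables a font contains."""
    tables = set(font_tables)
    # Assumption: Thai is checked first, since it has no defined OpenType behavior.
    if script == "Thai":
        return Strategy.THAI_RULES
    if "GSUB" in tables:
        return Strategy.OPENTYPE
    if "mort" in tables or "morx" in tables:
        return Strategy.AAT
    if script in ("Arabic", "Hebrew") and has_presentation_forms:
        return Strategy.PRESENTATION_FORMS
    return Strategy.DEFAULT
```

For example, `choose_strategy("Arabic", ["cmap", "glyf"], has_presentation_forms=True)` falls through the table checks and selects the presentation-forms path.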


International Unicode Conferences are organized by Global Meeting Services, Inc., (GMS). GMS is pleased to be able to offer the International Unicode Conferences under an exclusive license granted by the Unicode Consortium. All responsibility for conference finances and operations is borne by GMS. The independent conference board serves solely at the pleasure of GMS and is composed of volunteers active in Unicode and in international software development. All inquiries regarding International Unicode Conferences should be addressed to info@global-conference.com.

Unicode and the Unicode logo are registered trademarks of Unicode, Inc. Used with permission.

22 Jun 2001, Webmaster