This session offers three separate mini-tutorials covering characters, glyphs, and rendering.
What You Need To Know About Processing and Rendering Multilingual Text
The advent of multilingual information processing with Unicode requires the designer to have a deeper knowledge of rendering characters for display and printing than is necessary for a single script such as Latin. Rendering technology that is adequate for a language written in the Latin script, such as English, may prove totally inadequate for scripts such as Arabic or Devanagari. This presentation introduces a framework that characterizes a character in terms of its information, its associated shape (or glyph), and the relationship between these two attributes. It first differentiates between the domains of characters and of glyphs, and explains when it is appropriate to do processing in one domain versus the other. Next, it describes three different technologies used to render Unicode characters into glyphs. Finally, it describes several design considerations.
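As a small illustration of the character and glyph domains described above, the following Python sketch uses the standard unicodedata module. It relies on an assumption made purely for illustration: the Arabic presentation-form code points (U+FE8F to U+FE92) stand in for the four contextual glyphs of the letter BEH. A real rendering engine selects glyphs from a font rather than from these compatibility code points, and the helper name glyph_context is hypothetical.

```python
import unicodedata

# U+0628 ARABIC LETTER BEH is the single character stored in text.
BEH = "\u0628"

# Unicode also encodes the four contextual shapes of BEH as compatibility
# "presentation form" code points. They are used here only as stand-ins
# for glyphs; real renderers pick glyphs from a font.
PRESENTATION_FORMS = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]


def glyph_context(form: str) -> str:
    """Return the positional tag (for example 'initial') of a presentation form."""
    # The decomposition looks like '<initial> 0628': a tag plus the base character.
    tag, _base = unicodedata.decomposition(form).split()
    return tag.strip("<>")


for form in PRESENTATION_FORMS:
    # Compatibility normalization folds every shape back to the one character.
    assert unicodedata.normalize("NFKC", form) == BEH
    print(f"U+{ord(form):04X} {glyph_context(form):9} -> U+{ord(BEH):04X}")
```

The point of the sketch is that searching, sorting, and storage operate on U+0628 alone, while the choice among the four shapes belongs to the glyph domain.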
The Unicode Character-Glyph Model and Rendering Complex Scripts
Statement of Purpose: This is a slightly expanded and refocused version of the session on complex Unicode rendering which I've given at the past three IUCs. It's now aimed at being a tutorial and discusses some of the requirements in greater depth.
Summary: From its beginnings, Unicode has made an explicit separation between the processes of text generation, text storage, and text rendering. The division between text storage and text rendering is perhaps the most fundamental and is explicitly formulated as the character-glyph model.
Although it's possible to represent most Western European and East Asian languages on computers without the character-glyph distinction, there are a number of scripts, notably the various Levantine and South Asian scripts, where this cannot be done. Moreover, the very large number of accented Latin letters in actual use, together with the needs of high-end Western typography, also forces Unicode-based systems to take the character-glyph separation into account.
Specific examples from various writing systems illustrating the need for character-glyph separation will be given. Specific Unicode implementations will also be referenced to show how this model is taken into account and how application developers can provide support for it in their own programs.
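The remark about accented Latin letters can be made concrete with a minimal Python sketch using the standard unicodedata module. The sample letters and the helper name code_points are chosen here for illustration and are not taken from the talk.

```python
import unicodedata


def code_points(text: str) -> list[str]:
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


precomposed = "\u00E9"   # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

# Two different character sequences that a renderer should display identically.
assert precomposed != decomposed
assert unicodedata.normalize("NFC", decomposed) == precomposed
assert unicodedata.normalize("NFD", precomposed) == decomposed

# Not every accented letter has a precomposed code point: 'q' with an acute
# accent stays as two characters even after composition, so the renderer
# has to position the accent glyph over the base glyph itself.
q_acute = unicodedata.normalize("NFC", "q\u0301")

print(code_points(precomposed))
print(code_points(decomposed))
print(code_points(q_acute))
```

Because combinations such as the last one cannot all be given their own code points, a system that maps one character to exactly one glyph cannot display them correctly.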
Character Sets and Encodings
The computing world is made up of heterogeneous systems, each with its own set of character sets and/or encodings. Until such time as the world speaks Unicode, it will be necessary to understand how these character sets are structured, and how they interact with one another.
This talk will attempt to explain the format and contents of the most common character set encodings, including:
The goal of this talk is to provide the listener with the basic knowledge required to understand this modern Tower of Babel, without becoming too confused by the plethora of proprietary, national and international standards.
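To show why heterogeneous encodings need to be understood, here is a minimal Python sketch that encodes one character under several encodings and then decodes the bytes with the wrong one. The four encodings are picked only as familiar examples available in Python's standard codecs; they are an assumption of this sketch and not the list covered by the talk.

```python
# Illustrative choice of encodings (an assumption, not the talk's list).
ENCODINGS = ("latin-1", "cp437", "utf-8", "utf-16-be")

text = "\u00E9"  # é

# The same character becomes a different byte sequence in each encoding.
encoded = {name: text.encode(name) for name in ENCODINGS}
for name, raw in encoded.items():
    print(f"{name:10} {raw.hex()}")

# Reading bytes with the wrong encoding silently produces different text.
misread = encoded["utf-8"].decode("latin-1")
print(repr(misread))

# Decoding with the encoding that produced the bytes recovers the original.
assert encoded["utf-8"].decode("utf-8") == text
```

The mismatch in the second step raises no error, which is why a receiving system needs to know which encoding the sender used.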
When the world wants to talk, it speaks Unicode
International Unicode Conferences are organized by Global Meeting Services, Inc., (GMS).
GMS is pleased to be able to offer the International Unicode Conferences under an exclusive license granted by the Unicode Consortium. All responsibility for conference finances and operations is borne by GMS. The independent conference board serves solely at the pleasure of GMS and is composed of volunteers active in Unicode and in international software development. All inquiries regarding International Unicode Conferences should be addressed
Unicode and the Unicode logo are registered trademarks of Unicode, Inc. Used with permission.
25 January 1999, Webmaster