
Unicode 4.0 Tutorial

Asmus Freytag - ASMUS, Inc.

Intended Audience: All
Session Level: Beginner, Intermediate, Advanced

The Unicode Tutorial

The Unicode Standard is a universal character encoding, and as such it is of some complexity. It has just undergone a major revision. Asmus Freytag, one of the longtime contributors to the standard, has redeveloped his Unicode tutorial to bring it up to date with Unicode 4.0.

Unicode 4.0 Tutorial: Part I - Core Concepts in Action

The first part of the Unicode Tutorial is a uniquely accessible and entertaining way to visualize the core concepts of the Unicode Standard. In this part you will find answers to these questions:

  • What is a Unicode character?
  • How are Unicode characters represented?
  • How do Unicode character codes fit into a modern computing environment?
  • How are Unicode characters interchanged?
  • What is the interaction between Unicode and rich text (markup)?
  • What are Unicode character properties, and why are they important?
  • How do end-users experience Unicode?

Throughout this part, the Unicode Tutorial gives typical examples of how the Unicode Standard interacts with the other elements of an internationalized software architecture. With the help of concrete scenarios for the use of Unicode characters, you will become familiar with the role the Unicode Standard plays and the benefits of supporting it.

This part is accessible to and recommended for audiences of all backgrounds.

Unicode 4.0 Tutorial: Part II - Fundamental Specifications

Part II of the Unicode Tutorial builds on the concepts introduced in Part I and systematically presents the details of the fundamental specifications that are part of the Unicode Standard. Part II provides answers to these questions:

  • What is the organization of the Unicode Code Space?
  • What are the principles used to allocate and unify characters?
  • What is Han Unification?
  • What is a Unicode encoding form?
  • What is the actual definition of UTF-8, UTF-16, and UTF-32?
  • What is a byte order mark?
  • Which encoding form should I select?
  • What is the Unicode Character Property Model?
  • Where are all the pieces that make up the Unicode Standard?

Part II is recommended for anyone interested in more detailed information.
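
As a small taste of the encoding form questions above, the following Python sketch prints the code units of one short string in UTF-8, UTF-16, and UTF-32, and checks the byte order mark. It is an illustration only and not taken from the tutorial: the sample string and the helper name code_units are assumptions chosen for this example.

```python
import codecs

def code_units(text, encoding, unit_size):
    """Return the code units of `text` in `encoding` as upper-case hex strings.

    `unit_size` is the size of one code unit in bytes (1, 2, or 4).
    Hypothetical helper written for this illustration.
    """
    data = text.encode(encoding)
    return [data[i:i + unit_size].hex().upper()
            for i in range(0, len(data), unit_size)]

# Sample text (an assumption for this sketch): U+0041, U+00E9, U+10400.
# The last character lies outside the Basic Multilingual Plane.
sample = "A\u00E9\U00010400"

# Big-endian variants are used so that no byte order mark is prepended.
utf8 = code_units(sample, "utf-8", 1)       # 1, 2, and 4 code units
utf16 = code_units(sample, "utf-16-be", 2)  # 1, 1, and 2 code units (surrogate pair)
utf32 = code_units(sample, "utf-32-be", 4)  # always 1 code unit per character

# The byte order mark is the character U+FEFF in the chosen encoding.
bom_utf16_be = "\uFEFF".encode("utf-16-be")

print("UTF-8: ", " ".join(utf8))
print("UTF-16:", " ".join(utf16))
print("UTF-32:", " ".join(utf32))
print("BOM matches codecs.BOM_UTF16_BE:", bom_utf16_be == codecs.BOM_UTF16_BE)
```

The same three characters take seven, four, and three code units respectively, which is one of the trade-offs behind the question of which encoding form to select.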

Unicode 4.0 Tutorial: Part III - Unicode Algorithms

The Unicode Standard and related specifications by the Unicode Consortium specify a number of algorithms. The specification of these algorithms in the Unicode Standard depends on the Unicode Character Properties. Part III surveys the algorithms specified in the Unicode Standard, and extends the discussion of Unicode character properties as they relate to each algorithm. Part III provides answers to these questions:

  • What is a Unicode Algorithm?
  • How is an abstract algorithm different from an actual implementation?
  • How does it relate to Unicode Character Properties?
  • What is Unicode Normalization?
  • What requirements does it address?
  • What is a Unicode Normalization form?
  • What is the actual specification of NFC, NFD, NFKC, and NFKD?
  • What do I need to know in applying normalization?
  • How does Normalization interact with the web?
  • What is the Unicode Bidirectional Algorithm?
  • How is it defined and how does it interact with other text layout tasks?
  • When do I need to support it?
  • How do I determine text boundaries and line breaks?
  • What are the issues?
  • What resources does the Unicode Standard provide?
  • Is any specific type of support required?
  • What are character foldings?
  • How does case mapping work?
  • How do character transformations interact with Normalization?

Part III is very detailed and will touch on the description of algorithms and other material that may require some familiarity with technical concepts.
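
For readers who want a concrete picture of the normalization and case folding questions above, here is a minimal Python sketch using the standard unicodedata module. It is an illustration only and not taken from the tutorial; the sample strings are assumptions, and Python ships character data for a later Unicode version than 4.0, although these particular characters behave the same way.

```python
import unicodedata

# Two canonically equivalent spellings of "é" (sample data for this sketch).
composed = "\u00E9"      # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# NFC composes, NFD decomposes; both map equivalent strings to one form.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)

# Compatibility forms (NFKC, NFKD) additionally replace compatibility
# characters such as the "fi" ligature; the canonical forms leave it alone.
ligature = "\uFB01"      # LATIN SMALL LIGATURE FI
nfkc = unicodedata.normalize("NFKC", ligature)
nfc_lig = unicodedata.normalize("NFC", ligature)

# Case folding is for caseless matching and can change string length.
folded = "Stra\u00DFe".casefold()

print(nfc == composed, nfd == decomposed)
print(nfkc, nfc_lig == ligature)
print(folded)
```

Comparing strings only after bringing both to the same normalization form is the usual reason to apply normalization at all.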

When the world wants to talk, it speaks Unicode

International Unicode Conferences are organized by Global Meeting Services, Inc. (GMS). GMS is pleased to be able to offer the International Unicode Conferences under an exclusive license granted by the Unicode Consortium. All responsibility for conference finances and operations is borne by GMS. The independent conference board serves solely at the pleasure of GMS and is composed of volunteers active in Unicode and in international software development. All inquiries regarding International Unicode Conferences should be addressed to info@global-conference.com.

Unicode and the Unicode logo are registered trademarks of Unicode, Inc. Used with permission.

30 May 2003, Webmaster