UnicodeIUC24
Abstract

What's New in Unicode 4.0

Mark Davis - IBM Corporation

Intended Audience: Software Engineers
Session Level: Beginner, Intermediate, Advanced

Unicode 4.0.0 is the newest major version of the Unicode Standard, including a significant update of its widely used Unicode Character Database. Version 4.0 defines over 96,000 characters for the languages of the world, and provides detailed properties and algorithms for computer systems. The current release contains all the information needed to update software to support the latest characters.

As a significant step towards the digital preservation of world heritage, this new version encodes characters for Linear B and other ancient Mediterranean alphabets. At the same time, it expands support for modern minority languages, removing a major barrier that has prevented people from using their own languages on computers.

The text of the standard and of the Unicode Standard Annexes has undergone substantial revision. In particular, the Unicode Character Encoding Model is incorporated, resulting in fully specified definitions and conformance requirements for UTF-8, UTF-16, and UTF-32. These are also clearly contrasted with the in-process use of Unicode strings. Other changes affect identifiers in programs, bidirectional (bidi) text, line breaking and other boundaries, case conversion and case detection, and scripts.
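
As a rough illustration of how the three encoding forms differ, the following Python sketch encodes a single supplementary-plane code point, U+10000 from the Linear B Syllabary block, and lists the resulting code units. The helper name `code_units` is hypothetical and chosen only for this example; it is not taken from the standard or from the presentation.

```python
# Sketch: one code point, three encoding forms.
# `code_units` is an illustrative helper, not part of any Unicode API.

def code_units(text, encoding, width):
    """Encode `text` and split the bytes into big-endian code units of `width` bytes."""
    data = text.encode(encoding)
    return [int.from_bytes(data[i:i + width], "big")
            for i in range(0, len(data), width)]

ch = "\U00010000"  # a code point in the Linear B Syllabary block

utf8 = code_units(ch, "utf-8", 1)       # 8-bit code units
utf16 = code_units(ch, "utf-16-be", 2)  # 16-bit code units (a surrogate pair)
utf32 = code_units(ch, "utf-32-be", 4)  # 32-bit code units

print("UTF-8 :", [f"{u:02X}" for u in utf8])
print("UTF-16:", [f"{u:04X}" for u in utf16])
print("UTF-32:", [f"{u:08X}" for u in utf32])
```

The same character occupies four code units in UTF-8, two in UTF-16, and one in UTF-32, which is why the standard defines each form and its conformance requirements separately.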

In this version, 1,226 new character assignments were made (over and above what was in Unicode 3.2). In the Unicode Character Database, this version introduces the concept of provisional properties, clarifies the relationships between properties, and provides precisely defined fallback properties for characters not explicitly defined in the data files. A number of corrections to properties were also incorporated, and the UCD documentation was combined and improved.
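
The idea of fallback properties can be observed with Python's `unicodedata` module. This is only a sketch under two assumptions: the module reflects the Unicode version bundled with the interpreter rather than Unicode 4.0 itself, and the code point U+50000 is unassigned in that version (planes 4 through 13 have no assignments at the time of writing).

```python
import unicodedata

# Assumption: U+50000 is unassigned in the interpreter's Unicode data.
ch = chr(0x50000)

# Code points with no explicit entry still report well-defined values.
category = unicodedata.category(ch)   # general category for unassigned code points
ccc = unicodedata.combining(ch)       # canonical combining class
name = unicodedata.name(ch, None)     # unassigned code points have no name

print(unicodedata.unidata_version, category, ccc, name)
```

Here the general category falls back to `Cn` (unassigned) and the canonical combining class falls back to 0, so software can process such code points without special-casing missing data.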

This presentation discusses the changes to the text of the standard, and outlines the changes made in the Unicode Character Database and in the Unicode Standard Annexes.

International Unicode Conferences are organized by Global Meeting Services, Inc., (GMS). GMS is pleased to be able to offer the International Unicode Conferences under an exclusive license granted by the Unicode Consortium. All responsibility for conference finances and operations is borne by GMS. The independent conference board serves solely at the pleasure of GMS and is composed of volunteers active in Unicode and in international software development. All inquiries regarding International Unicode Conferences should be addressed to info@global-conference.com.

Unicode and the Unicode logo are registered trademarks of Unicode, Inc. Used with permission.

30 May 2003, Webmaster