Unicode Character Database 5.0 and Unicode Collation Algorithm 5.0 Released

From: Magda Danish (Unicode) (v-magdad@microsoft.com)
Date: Tue Jul 18 2006 - 09:52:35 CDT

Mountain View, CA, July 18, 2006 -- The Unicode® Consortium announces the release of a significant update of its widely used Unicode Character Database (UCD). The new version, Version 5.0, defines more than 99,000 characters for the languages of the world and provides the detailed properties needed for computer software implementations. This latest level of the UCD contains all the information needed to update software to support the characters and algorithms that are the foundation for all modern computer programs, including the latest data for Unicode security mechanisms, collation, and locales.

For the first time, the Unicode Collation Algorithm (UCA) is released in parallel with the UCD: UCA Version 5.0 and UCD Version 5.0 are available simultaneously, enabling default collation for all of the more than 99,000 characters. For more information on UCA 5.0, see http://www.unicode.org/reports/tr10.

Implementers can now more quickly update their software to fully support minority languages, improved Indic processing, and IICore, the newly published subset of the most useful Chinese characters for mobile and small applications.

Version 5.0 of the UCD opens the power of the Common Locale Data Repository (CLDR) Version 1.4, which now supports 360 locales (121 languages and 142 territories). The systemization and extension of character properties will enable improved text processing for all CLDR locales.

In this latest version, the UCD data provides dependable caseless matching through stable case folding operations. Version 5.0 data is also the basis for better interoperability for bidirectional scripts (such as Arabic and Hebrew), line breaking, and text segmentation.
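
As a rough illustration of caseless matching built on case folding, here is a minimal Python sketch. It is not part of the release: the helper names caseless_key and caseless_equal are hypothetical, and Python's str.casefold() and unicodedata module use whichever UCD version is bundled with the interpreter, not necessarily Version 5.0. The NFD steps follow the canonical caseless matching definition in the Unicode Standard.

```python
import unicodedata


def caseless_key(text: str) -> str:
    """Return a comparison key: NFD, then full case folding, then NFD again.

    Hypothetical helper for illustration only.
    """
    decomposed = unicodedata.normalize("NFD", text)
    # str.casefold() applies Unicode full case folding (e.g. "ß" becomes "ss").
    folded = decomposed.casefold()
    # Folding can produce sequences that are not normalized, so normalize again.
    return unicodedata.normalize("NFD", folded)


def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings without regard to case."""
    return caseless_key(a) == caseless_key(b)


if __name__ == "__main__":
    print(caseless_equal("Straße", "STRASSE"))
    print(caseless_equal("ΣΊΣΥΦΟΣ", "σίσυφος"))
    print(caseless_equal("Unicode", "Unicorn"))
```

Comparing folded keys rather than lowercased strings matters for characters such as "ß" and the Greek final sigma, where simple lowercasing would leave the two spellings unequal.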

With the release of this data, implementers can begin to move their software to Version 5.0, in anticipation of the Unicode Version 5.0 support that will ship with many products and libraries, including Windows Vista, ICU, and offerings from Google, Yahoo!, and many other companies.

The release of Version 5.0 of the UCD is the first step in the release of The Unicode Standard, Version 5.0. The book (ISBN 0-321-48091-0) will be published by Addison-Wesley in the fourth quarter of 2006. For more information about Unicode 5.0 and the Unicode Character Database, see http://www.unicode.org/versions/Unicode5.0.0/.

The latest features of Unicode Version 5.0 will also be showcased at the 30th Internationalization and Unicode Conference (IUC) on November 17-19, 2006 in Washington, D.C. -- see http://www.unicodeconference.org/.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards. The membership of the consortium represents a broad spectrum of corporations and organizations in the computer and information processing industry: Adobe Systems, L'Agence intergouvernementale de la Francophonie, Apple Computer, Basis Technology, Denic e.G., Google, Government of India - Ministry of Information Technology, Government of Pakistan - National Language Authority, HP, IBM, Justsystem, Microsoft, Monotype Imaging, Oracle, SAP, Sun Microsystems, Sybase, The University of California at Berkeley, Yahoo, plus well over a hundred Associate, Liaison, and Individual members.

For more information, please contact the Unicode Consortium (http://www.unicode.org/).

---------------------------

Magda Danish
Sr. Administrative Director
The Unicode Consortium
650-693-3921
magda@unicode.org

This archive was generated by hypermail 2.1.5 : Tue Jul 18 2006 - 10:01:02 CDT