BETA Unicode 5.1.0
The next version of the Unicode Standard will be Version 5.1.0.
The beta version of the documentation for Unicode 5.1.0 is located at:
http://www.unicode.org/versions/Unicode5.1.0/
This version is planned for release in March 2008.
A beta version of the 5.1.0 Unicode Character Database
files is also available for public comment. We strongly encourage
implementers to download these files and test them with their
programs well before the end of the beta period. These files are located at:
http://www.unicode.org/Public/5.1.0/
ftp://www.unicode.org/Public/5.1.0/
For guidance on how to focus your review, see the section
Notable Issues for Beta Testers below.
Any comments on the beta Unicode 5.1.0, the UCD 5.1.0, or the
5.1.0 UAXs should be reported using the Unicode reporting form.
The comment period ends January 28, 2008.
All substantive comments must be received by that date for
consideration at the next UTC meeting. Editorial comments (typos,
etc.) may be submitted after that date for consideration in the final
editorial work.
Note: All beta files may be updated, replaced, or
superseded by other files at any time. The beta files will be
discarded once Unicode 5.1.0 is final. It is inappropriate to
cite these files as other than a work in progress.
The Unicode Consortium provides early access to updated versions of the data files
and text to give reviewers and developers as much time as possible to ensure a problem-free adoption of Version 5.1.0.
The assignment of characters for Unicode 5.1.0 is now stable. There will be no further additions or modifications of code points.
One of the main purposes of the beta review period, however, is to verify and correct the preliminary character property assignments in the Unicode Character Database. Reviewers should check for property changes to existing Unicode 5.0.0 characters, as well as the property values for the new Unicode 5.1.0 character additions.
The beta review period is a good opportunity to add support for the new
Unicode 5.1.0 characters in internal versions of software, so that software can
be tested to verify that the new characters and property assignments don't cause
problems when upgraded to Version 5.1.0 of Unicode.
However, because the Unicode Character Database files will be updated during the beta review period, before the final Version 5.1.0 is released, no products or implementations should be released based on the beta data files. For released products, use the final, approved Version 5.1.0 data files, expected in March 2008.
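The property review described above can be partly automated. The following is a minimal sketch, assuming the old and new data are available as lines in the semicolon-separated UnicodeData.txt layout (code point first, then name, general category, and so on). The function names and the sample lines are invented for illustration and are not real UCD data.

```python
def load_unicode_data(lines):
    """Map each code point (int) to the list of its remaining fields."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        table[int(fields[0], 16)] = fields[1:]
    return table


def compare_versions(old_lines, new_lines):
    """Return (added, changed) between two UnicodeData-style listings.

    added   -- sorted list of code points present only in the new data
    changed -- {code point: [(field number, old value, new value), ...]}
               for code points present in both listings
    """
    old = load_unicode_data(old_lines)
    new = load_unicode_data(new_lines)
    added = sorted(cp for cp in new if cp not in old)
    changed = {}
    for cp in sorted(old.keys() & new.keys()):
        # Field 0 is the code point itself, so stored fields start at 1.
        diffs = [
            (i + 1, o, n)
            for i, (o, n) in enumerate(zip(old[cp], new[cp]))
            if o != n
        ]
        if diffs:
            changed[cp] = diffs
    return added, changed


# Hypothetical sample lines (NOT taken from the real data files).
OLD_SAMPLE = [
    "0041;EXAMPLE A;Lu;0;L;;;;;N;;;;0061;",
    "E000;EXAMPLE B;So;0;ON;;;;;N;;;;;",
]
NEW_SAMPLE = [
    "0041;EXAMPLE A;Lu;0;L;;;;;N;;;;0061;",
    "E000;EXAMPLE B;Sm;0;ON;;;;;N;;;;;",
    "E001;EXAMPLE C;Lo;0;L;;;;;N;;;;;",
]

added, changed = compare_versions(OLD_SAMPLE, NEW_SAMPLE)
for cp in added:
    print("added   U+%04X" % cp)
for cp, diffs in changed.items():
    for field, old_value, new_value in diffs:
        print("changed U+%04X field %d: %s -> %s" % (cp, field, old_value, new_value))
```

In practice the two inputs would be the 5.0.0 UnicodeData.txt and the beta 5.1.0 file read from disk. Range placeholders (lines whose name ends in "First>" or "Last>") are treated here as ordinary entries, which is enough to notice when a range boundary moves.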
Notable Issues for Beta Testers
Some links between beta
documents and UAXs may not work correctly, because they may point
to documents under their final names or revision numbers. This list is
preliminary; more issues may be added during the beta period.
All Unicode Standard Annexes are being modified in
Unicode 5.1.0, and the modifications may be coordinated with changes in properties. For the
current proposed updates to particular UAXs, see
Technical Reports
or use the links on the navigation bar of this page.
Particular issues in the UAXs are also the focus of specific
Public Review Issues.
Each proposed change in a UAX is highlighted, so that you can focus
your review on those sections if you have limited time. The changes
are also listed in each Modifications section (linked from the table
of contents), so you can check on those areas that might be of most
interest.
Beta reviewers often
overlook the fact that the text in UCD.html is also updated and
needs careful checking. To review it, go to
http://www.unicode.org/Public/5.1.0/ucd/ and look for files of
the form UCD-5.1.0dn.html. Other documentation (.html) files
in the ucd directory and its subdirectories also need review.
Check carefully for any hard-coded range assumptions about
Unified CJK Ideographs, because the end of that range has changed from U+9FBB to U+9FC3 in this version.
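One way to avoid a hard-coded end point is to read the range from the data file itself. The sketch below assumes the UnicodeData.txt convention of marking a range with a pair of lines whose names end in ", First>" and ", Last>". The helper names are hypothetical, the sample lines are abbreviated to the fields the code needs, and the start value U+4E00 is an assumption not stated in the text above.

```python
def parse_ranges(lines):
    """Return {label: (first, last)} for each <label, First>/<label, Last> pair."""
    starts = {}
    ranges = {}
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 2:
            continue
        code_point = int(fields[0], 16)
        name = fields[1]
        if name.startswith("<") and name.endswith(", First>"):
            starts[name[1:-len(", First>")]] = code_point
        elif name.startswith("<") and name.endswith(", Last>"):
            label = name[1:-len(", Last>")]
            ranges[label] = (starts.pop(label), code_point)
    return ranges


def is_unified_cjk(cp, ranges):
    """True if cp falls inside the data-driven 'CJK Ideograph' range."""
    first, last = ranges["CJK Ideograph"]
    return first <= cp <= last


# Abbreviated sample lines; the U+4E00 start is an assumption,
# the two end points are the values quoted in the text above.
SAMPLE_5_0 = ["4E00;<CJK Ideograph, First>;Lo", "9FBB;<CJK Ideograph, Last>;Lo"]
SAMPLE_5_1 = ["4E00;<CJK Ideograph, First>;Lo", "9FC3;<CJK Ideograph, Last>;Lo"]

ranges_5_0 = parse_ranges(SAMPLE_5_0)
ranges_5_1 = parse_ranges(SAMPLE_5_1)

# Code points that a 5.0.0-based hard-coded check would wrongly reject.
missed = [
    cp
    for cp in range(0x9F00, 0xA000)
    if is_unified_cjk(cp, ranges_5_1) and not is_unified_cjk(cp, ranges_5_0)
]
print(["U+%04X" % cp for cp in missed])
```

Code that must keep a literal constant can at least use a check like this in its test suite, so that a stale end point fails loudly when the data files are upgraded.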
UTS #10:
Unicode Collation Algorithm (UCA) determines the default sorting
order for Unicode characters. The data for UCA is kept in sync with
successive versions of Unicode, and given the same version number.
Thus the 5.1.0 version of UTS #10 is available for review during the
same period as Unicode 5.1.0. The draft data for UCA 5.1.0 is found
in
http://www.unicode.org/Public/UCA/5.1.0/.
To facilitate the migration of products to the final release version of the Unicode Character Database files, dated, diffable
XML versions of the Unicode Character Database will be made available, so that
implementers can check the details of any changes that occurred during the beta
review period. The XML files are in the
http://www.unicode.org/Public/5.1.0/diffs/ directory. For more information, see the readme.txt file.