The beta directory for the Unicode 3.0.1 update has been created.
Because of the current problem with anonymous ftp on www.unicode.org,
only the http version of this directory is available for now:

http://www.unicode.org/Public/3.0-Update1/

The updated beta files at that location for the Unicode 3.0.1 update are:

5179 Jul 31 21:14 ArabicShaping-3d1.beta.txt
43559 Jul 31 21:14 CaseFolding-2d1.beta.txt
5085 Jul 31 21:14 CompositionExclusions-2d1.beta.txt
55254 Jul 31 21:14 PropList-3.0.1d2.beta.txt
13841 Jul 31 21:14 SpecialCasing-3d2.beta.txt
48261 Jul 31 21:14 UnicodeData-3.0.1d1.beta.html
636269 Jul 31 21:15 UnicodeData-3.0.1d2.beta.txt

These are temporary names. Once the beta review closes, the "beta" tag
and the delta number will be dropped from each file name to give the
permanent versioned filename, and the latest version of each file will
be copied into the UNIDATA directory without the version extension.
Comparable changes will be made in the ftp hierarchy as soon as
regular ftp service can be restored on the server.

Before that happens, however, we would like to invite all interested
implementers to examine the data files and report any problems they
find, so that these can be corrected before the Unicode 3.0.1 update
is finalized.
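
For anyone scripting against the beta directory, the renaming described
above can be sketched as follows. This is only an illustration in Python:
the pattern <base>-<version>d<delta>.beta.<ext> is inferred from the file
names listed above, and the names BETA_NAME and final_names are
hypothetical, not part of any Unicode tooling.

```python
import re

# Assumed pattern, inferred from the beta file names in the listing:
#   <base>-<version>d<delta>.beta.<ext>
BETA_NAME = re.compile(
    r"^(?P<base>[A-Za-z]+)-(?P<version>\d+(?:\.\d+)*)d(?P<delta>\d+)"
    r"\.beta\.(?P<ext>\w+)$"
)


def final_names(beta_name):
    """Return (versioned_name, unversioned_name) for a beta file name."""
    m = BETA_NAME.match(beta_name)
    if m is None:
        raise ValueError(f"not a beta file name: {beta_name!r}")
    base, version, ext = m.group("base", "version", "ext")
    # Drop "beta" and the delta number for the versioned name,
    # then drop the version as well for the UNIDATA copy.
    return f"{base}-{version}.{ext}", f"{base}.{ext}"


if __name__ == "__main__":
    for name in ("PropList-3.0.1d2.beta.txt", "ArabicShaping-3d1.beta.txt"):
        print(name, "->", final_names(name))
```

A script that fetches files by name would need to switch from the first
form to the second or third once the beta review closes.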

Note that UnicodeData.txt and PropList.txt now explicitly contain
codepoint listings using the 5- or 6-digit UTF-32 notation. If
you run automated parsers on either of those files, be aware
of this change in convention and make sure your code can parse
codepoint values greater than 0xFFFF.
We are introducing this change now with the relatively trivial
listing of user-defined, unassigned, and not-a-character codepoints
past U+FFFF, so people can test out their implementations before
they get whumped with 40,000+ new characters from Planes 1, 2,
and 14 for the upcoming Unicode 3.1.
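
As a quick check for parser authors, here is a minimal sketch in Python
of reading a code point field that may have 4, 5, or 6 hex digits. The
function names parse_code_point and first_field are hypothetical, and
the sketch assumes the semicolon-separated layout of UnicodeData.txt; it
does not attempt to cover the layout of PropList.txt. Python integers
are unbounded, so the hazard shown here is a fixed 4-digit assumption;
in a language with fixed-width types the value also needs a type wider
than 16 bits.

```python
import string

MAX_CODE_POINT = 0x10FFFF


def parse_code_point(field):
    """Parse a hex code point of 4 to 6 digits, e.g. '0671' or '10FFFD'."""
    text = field.strip()
    if not 4 <= len(text) <= 6:
        raise ValueError(f"expected 4 to 6 hex digits, got {text!r}")
    if not all(c in string.hexdigits for c in text):
        raise ValueError(f"not a hex number: {text!r}")
    value = int(text, 16)
    if value > MAX_CODE_POINT:
        raise ValueError(f"code point out of range: {text!r}")
    return value


def first_field(line):
    """Return the code point from the first semicolon-separated field."""
    return parse_code_point(line.split(";", 1)[0])


if __name__ == "__main__":
    # Shortened, made-up sample lines; real lines carry more fields.
    for sample in ("0671;EXAMPLE;Lo", "F0000;EXAMPLE;Co", "10FFFD;EXAMPLE;Co"):
        print(hex(first_field(sample)))
```

Any pattern that matches exactly four hex digits, or any table indexed
by a 16-bit value, is the kind of thing worth testing against the new
listings.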
--Ken Whistler
===================================================================
The changes in the data files from the 3.0.0 release version are
as follows:

ArabicShaping.txt
  Updated the shaping class for 0671.

CaseFolding.txt
  This is a new contributory data file. See UTR #21, Case Mappings.

CompositionExclusions.txt
  Fixed a comment in the file.
  Added a minimal label/version comment at the top of the file.

PropList.txt
  Removed F8F0..F8FF from a listing of several properties. (Bug)
  Fixed the default bidi property to LR for all user-defined character
    ranges.
  Updated properties for 0E47. (removed from alphabetics, added to
    diacritics)
  Extended property listing to full UTF-16 range for user-defined
    characters (including Planes 15 and 16), for bidi LR, and
    for unassigned characters.
  Added not-a-character property (a property of codepoints, not
    of characters), and provided listing for full UTF-16 range.

SpecialCasing.txt
  Minor fixes to the BNF syntax.
  Addition of Lithuanian AFTER condition.
  Additions to the notes in the comments in the file.

UnicodeData.html
  Corrected a bullet numbering problem.
  Added documentation of range listing for Plane 15 and Plane 16
    user-defined characters.
  Added documentation of 4/5/6 digit hex notation conventions.

UnicodeData.txt
  Added definition ranges for Plane 15 and Plane 16 user-defined
    characters.
  Added "dena sum" in the ISO comment field for 0FCF.
  Added 10646-1 Annex P asterisk comments to 01A6, 0280.
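
The range listings mentioned for UnicodeData.txt are given as a pair of
lines marking the first and last code point of a range. A minimal
sketch in Python of collapsing such pairs is shown below. It assumes
the "<..., First>" / "<..., Last>" naming used for range entries; the
sample lines are shortened for illustration, their exact range labels
are an assumption, and the function name read_ranges is hypothetical.

```python
FIRST_SUFFIX = ", First>"
LAST_SUFFIX = ", Last>"


def read_ranges(lines):
    """Yield (first, last, name), collapsing First/Last pairs into one range."""
    pending = None  # (first code point, range label) while a range is open
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        cp = int(fields[0], 16)
        name = fields[1]
        if name.startswith("<") and name.endswith(FIRST_SUFFIX):
            pending = (cp, name[1:-len(FIRST_SUFFIX)])
        elif name.startswith("<") and name.endswith(LAST_SUFFIX):
            if pending is None:
                raise ValueError(f"range end without start: {line!r}")
            first, label = pending
            pending = None
            yield first, cp, label
        else:
            # Ordinary entry: a range of exactly one code point.
            yield cp, cp, name
    if pending is not None:
        raise ValueError("range start without end")


# Shortened sample lines; real lines carry more fields, and the range
# labels here are assumed rather than copied from the file.
SAMPLE = [
    "0671;ARABIC LETTER ALEF WASLA;Lo",
    "F0000;<Plane 15 Private Use, First>;Co",
    "FFFFD;<Plane 15 Private Use, Last>;Co",
    "100000;<Plane 16 Private Use, First>;Co",
    "10FFFD;<Plane 16 Private Use, Last>;Co",
]

if __name__ == "__main__":
    for first, last, name in read_ranges(SAMPLE):
        print(f"{first:04X}..{last:04X} {name}")
```

A parser that treats every line as a single character would see only
the two endpoints of each range, so it is worth checking how range
entries are handled alongside the longer code point values.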