From: Kenneth Whistler (kenw@sybase.com)
Date: Thu Dec 08 2005 - 16:35:33 CST
Werner Lemberg asked:
> UnicodeData.txt is, as far as I know, the central file describing the
> properties of the Unicode characters. As such it is tightly bound to
> the corresponding Unicode version, and I wonder why one of the most
> important elements, namely a version tag, is missing from this file.
> I consider this as a serious problem. Similarly, a copyright notice
> together with a license should be included, even if it just points to
> a URL holding the complete text.
It is a legacy format issue. UnicodeData.txt was the very first
of the data files defined for the Unicode Standard -- many years
ago. And there are many existing processes that parse it exactly
as is. To minimize the problems of compatibility going forward,
its format has been frozen for a long time -- and that includes
not adopting the comment and version conventions that the other
data files have.
The version of every UnicodeData.txt file is clear from context
in the ftp://www.unicode.org/Public/ directories, so if you have
a "loose" copy of UnicodeData.txt whose version you are unsure
about, it can always be determined by comparing dates and sizes
against the versions in the Unicode Character Database, or for
absolute certainty, by diffing contents.
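As a rough illustration of that comparison, here is a minimal Python
sketch. It assumes you have already downloaded the candidate
UnicodeData.txt files from the versioned directories to local paths.
The function names and the version-to-path mapping are hypothetical,
not part of any Unicode tooling. It checks sizes first and then compares
SHA-256 digests of the raw bytes, which stands in for a full diff when
all you need is a yes/no answer.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of the file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def identify_version(loose_copy, reference_copies):
    """Return the version labels whose reference file matches loose_copy.

    reference_copies is a hypothetical mapping from a version label
    (for example "4.1.0") to the local path of the UnicodeData.txt
    taken from that version's directory. A match means same size and
    same SHA-256 digest.
    """
    loose = Path(loose_copy)
    size = loose.stat().st_size
    digest = None  # computed lazily, only if some size matches
    matches = []
    for version, ref in reference_copies.items():
        ref = Path(ref)
        # Cheap check first: files of different sizes cannot be identical.
        if ref.stat().st_size != size:
            continue
        if digest is None:
            digest = file_digest(loose)
        if file_digest(ref) == digest:
            matches.append(version)
    return matches
```

Calling identify_version("UnicodeData.txt", {"4.0.1": "ucd-4.0.1/UnicodeData.txt",
"4.1.0": "ucd-4.1.0/UnicodeData.txt"}) with such (made-up) paths returns
the list of matching labels, or an empty list if none match. One caveat:
a copy whose line endings were converted in transit will not match
byte-for-byte, so normalize line endings first if that is a possibility.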
>
> I've only looked at version 4.1.0 -- maybe you've fixed this
> meanwhile.
No, this will not be changed in Unicode 5.0.0.
--Ken