From: Doug Ewell (email@example.com)
Date: Sat Dec 10 2005 - 10:09:45 CST
Richard Wordingham <richard dot wordingham at ntlworld dot com> wrote:
> The complaint wasn't about the format; it was about the content, i.e.
> the lack of any version marking. The only way of doing that would be a
> hack like abusing a Unicode_1_Name. However, that could break an
> application that knows how many characters there were in Unicode 1.0,
> as could supplying a spurious Unicode_1_Name for a character outside
> the BMP. (My first thought for such a hack was to put copyright
> information in the Unicode_1_Name for U+10FFFD or U+10000.)
When Ken and others say they can't add comments or a version record to
UnicodeData.txt because doing so would break existing parsers, that
means the file format does not provide for comments or a version record.
Support (or lack of support) for comments is just as much a part of the
file format as whether this field comes before that one.
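To make the "break existing parsers" point concrete, here is a minimal sketch of a strict reader. The parser is hypothetical (the names parse_strict and breaks_on are invented for illustration and are not taken from any real application); it assumes only that each record has 15 semicolon-separated fields with a hexadecimal code point first. A reader written this way has no notion of a comment line, so one added to the file is rejected.

```python
FIELD_COUNT = 15  # fields per UnicodeData.txt record


def parse_strict(line):
    """Hypothetical strict parser: every line must be a 15-field record."""
    fields = line.rstrip("\n").split(";")
    if len(fields) != FIELD_COUNT:
        raise ValueError(f"expected {FIELD_COUNT} fields, got {len(fields)}")
    code_point = int(fields[0], 16)  # raises ValueError if not hexadecimal
    return code_point, fields[1], fields[2]


def breaks_on(line):
    """Return True if the strict parser rejects the given line."""
    try:
        parse_strict(line)
    except ValueError:
        return True
    return False


record = "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"
comment = "# UnicodeData-4.1.0.txt"  # illustrative comment line only

print(parse_strict(record))
print(breaks_on(comment))
```

A more forgiving parser could skip lines starting with "#", but that tolerance would be a property of that one parser, not something the file format promises.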
When I talk about converting the file you get into the file you want, it
could be something as simple as adding a comment line containing the
version number, or adding copyright information to an existing record
where your apps know to treat it specially. Or it could mean converting
the file to XML or DBF, or inventing an entirely new format. The
important thing is the decision to convert the file UTC gives you and
use the converted version.
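The simplest of these conversions, adding a comment line that carries the version number, might be sketched as follows. This is only an illustration under stated assumptions: the "# Version:" convention, the function names, and the version string "4.1.0" are all made up for the example, and the version has to be supplied by hand because the original file does not contain it.

```python
VERSION_PREFIX = "# Version:"  # assumed private convention, not part of the UCD


def add_version_comment(lines, version):
    """Return the converted file: the original lines plus a version comment."""
    return [f"{VERSION_PREFIX} {version}"] + list(lines)


def parse_converted(lines):
    """Read the converted file and return (version, records)."""
    version = None
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            if line.startswith(VERSION_PREFIX):
                version = line[len(VERSION_PREFIX):].strip()
            continue  # comments are never treated as records
        records.append(line.split(";"))
    return version, records


original = ["0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"]
converted = add_version_comment(original, "4.1.0")  # version supplied manually
version, records = parse_converted(converted)
print(version, records[0][0], records[0][1])
```

Only your own applications read the converted file, so only they need to know about the comment convention; the file as distributed stays untouched for everyone else's parsers.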
In the case of adding version information to a file that doesn't already
have it, it is a matter of manually supplying the information, based on
personal knowledge or file size or whatever. The advantage is that you
only have to do it once, and don't have to embed the logic into your
--
Doug Ewell
Fullerton, California, USA
http://users.adelphia.net/~dewell/