From: Kenneth Whistler (firstname.lastname@example.org)
Date: Mon Apr 04 2005 - 17:22:51 CST
John Burger asked:
> >> The problem will of course come when new UCD data is fed into an old
> >> normaliser.
> > Actually, it will not. If a Unicode normalizer was a Unicode 4.0
> > normalizer, it will *stay* a Unicode 4.0 normalizer.
> Even if it is fed new UCD data?
It depends on what Peter Kirk meant by a "normaliser" and
by "UCD data".
If by "normaliser" he means a normalizer generator that takes
UCD data files as input and generates a normalizer process that
corresponds to the version of UCD data files, then of course
what you input matters.
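
To make the generator case concrete, here is a toy sketch in Python, under loud assumptions: the function name `build_decomposer` and the two sample lists are hypothetical, the lines are only in the semicolon-separated style of UnicodeData.txt, and the generated process does canonical decomposition only (no canonical reordering, no Hangul, no composition). It is not a conformant normalizer, just an illustration that the generated behavior follows the data fed in.

```python
# Toy "normalizer generator": builds a decomposition function from
# UnicodeData.txt-style lines (field 0 = code point, field 5 = decomposition).
# Hypothetical sketch; it skips reordering, Hangul and composition entirely.

def build_decomposer(ucd_lines):
    table = {}
    for line in ucd_lines:
        fields = line.split(";")
        mapping = fields[5]
        # Skip characters with no mapping and compatibility mappings (<tag> ...).
        if not mapping or mapping.startswith("<"):
            continue
        source = chr(int(fields[0], 16))
        table[source] = "".join(chr(int(cp, 16)) for cp in mapping.split())

    def decompose(text):
        # Apply the table recursively until no mapped character remains.
        return "".join(
            decompose(table[ch]) if ch in table else ch for ch in text
        )

    return decompose


# Two illustrative inputs; calling the smaller one "older" is an assumption
# made for the example, not a statement about any real UCD release.
SMALLER_DATA = [
    "00C5;LATIN CAPITAL LETTER A WITH RING ABOVE;Lu;0;L;0041 030A;;;;N;;;;00E5;",
]
LARGER_DATA = SMALLER_DATA + [
    "212B;ANGSTROM SIGN;Lu;0;L;00C5;;;;N;;;;00E5;",
]

from_smaller = build_decomposer(SMALLER_DATA)
from_larger = build_decomposer(LARGER_DATA)
```

The two generated functions agree on U+00C5 but differ on U+212B, which is the sense in which the input to a generator matters.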
If by "normaliser" he means an already implemented normalizer
process and by "new UCD data" he means text data corresponding
to the new version of Unicode, then the behavior of the
normalizer should not change.
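
As an illustration of the second case, Python's standard `unicodedata` module can stand in for an already implemented normalizer. The sketch below assumes a CPython interpreter; `unicodedata.unidata_version` reports the UCD version the module was built from, and `unicodedata.ucd_3_2_0` is a second database object pinned to Unicode 3.2.0. The helper name `pinned_nfc` is hypothetical.

```python
import unicodedata

# The default normalizer corresponds to whatever UCD version this
# interpreter was built with; it does not change with the text it is given.
current_version = unicodedata.unidata_version

# A normalizer pinned to Unicode 3.2.0, shipped alongside the default one.
pinned = unicodedata.ucd_3_2_0


def pinned_nfc(text):
    """NFC according to the pinned 3.2.0 data, whatever the input text is."""
    return pinned.normalize("NFC", text)


# A followed by COMBINING RING ABOVE composes to U+00C5 under both normalizers.
sample = "A\u030a"
composed_pinned = pinned_nfc(sample)
composed_current = unicodedata.normalize("NFC", sample)
```

Passing text that contains newer characters to `pinned_nfc` does not turn it into a newer normalizer; `pinned.unidata_version` stays at 3.2.0.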
I wouldn't be surprised if a normalizer *generator* were broken
by a new version of the UCD data files corresponding to a new
version of Unicode. After all, most of them were broken by
Unicode 3.1 in the first place, if you recall.
But I consider a tool generator in a different class than
a final application that an end user interacts with. Anybody
who uses a tool generator and who then doesn't test the tool
(in this case a normalizer process) that it outputs for
conformance to the version of the standard it supposedly
supports -- again deserves what they get. And if the tool
generator breaks on a new generation of UCD data files, that
should be a pretty good sign that they've got some work to
do before it is going to produce a conformant tool.
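
A minimal sketch of that kind of conformance test, in Python, assuming the five-column format of NormalizationTest.txt (source; NFC; NFD; NFKC; NFKD). The names `parse_columns` and `failures` are hypothetical, and the sample lines are a tiny excerpt in that format rather than the full file, which should be taken from the UCD version being claimed.

```python
import unicodedata

# A few lines in NormalizationTest.txt format: c1;c2;c3;c4;c5; # comment
SAMPLE_TEST_LINES = [
    "@Part0 # Specific cases",
    "1E0A;1E0A;0044 0307;1E0A;0044 0307; # LATIN CAPITAL LETTER D WITH DOT ABOVE",
    "212B;00C5;0041 030A;00C5;0041 030A; # ANGSTROM SIGN",
]


def parse_columns(line):
    """Return the five columns as strings, or None for comments and @Part lines."""
    data = line.split("#", 1)[0].strip()
    if not data or data.startswith("@"):
        return None
    fields = data.split(";")[:5]
    return ["".join(chr(int(cp, 16)) for cp in f.split()) for f in fields]


def failures(lines, normalize=unicodedata.normalize):
    """List the test lines on which the given normalizer is not conformant."""
    bad = []
    for line in lines:
        cols = parse_columns(line)
        if cols is None:
            continue
        c1, c2, c3, c4, c5 = cols
        ok = (
            all(normalize("NFC", c) == c2 for c in (c1, c2, c3))
            and all(normalize("NFC", c) == c4 for c in (c4, c5))
            and all(normalize("NFD", c) == c3 for c in (c1, c2, c3))
            and all(normalize("NFD", c) == c5 for c in (c4, c5))
            and all(normalize("NFKC", c) == c4 for c in cols)
            and all(normalize("NFKD", c) == c5 for c in cols)
        )
        if not ok:
            bad.append(line)
    return bad


def broken_normalizer(form, text):
    # Stand-in for the output of a broken generator: it changes nothing.
    return text
```

Calling `failures(SAMPLE_TEST_LINES)` checks the interpreter's own normalizer, and passing `broken_normalizer` as the second argument shows how a tool that was never tested would be caught.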
Note, for example, that anyone who tried to implement a
fully generalized normalization generator based on UCD
data files would have had it broken by the introduction
of NormalizationCorrections.txt as a UCD data file in
Unicode 3.2.0. That was a rather more serious departure in
input than depending on the assumption that any character
on the BMP would normalize completely to characters in the