From: Simon Josefsson (firstname.lastname@example.org)
Date: Wed Jan 26 2005 - 13:57:18 CST
"Marcin 'Qrczak' Kowalczyk" <email@example.com> writes:
> Simon Josefsson <firstname.lastname@example.org> writes:
>> This change appears to break backwards compatibility and normalization
>> stability. The PR29 text suggests that the problematic sequences do
>> not occur naturally. My question then is: why break normalization
>> stability over something that doesn't appear to be a practical
> Because normalizations should be idempotent. This was always intended,
> the old specification had a bug.
I think there are two kinds of idempotency under discussion:
The first, "internal-idempotency", is that NFKC(NFKC(x)) = NFKC(x).
The second, "version-idempotency", is that NFKC3.2(NFKC4.0(x)) = NFKC4.0(x).
The #61 proposal trades the second for the first.
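
To make the two properties concrete, here is a minimal Python sketch that tests them on a single string. The helper names and the sample string are illustrative assumptions, not anything taken from TR15 or PR29. It uses the standard unicodedata module, whose ucd_3_2_0 object exposes the Unicode 3.2 tables that Python keeps for IDNA and stringprep. The sketch does not assume that ucd_3_2_0 reproduces the pre-correction composition behaviour, so no result is claimed for the cross-version check.

```python
import unicodedata


def is_internally_idempotent(text: str, form: str = "NFKC") -> bool:
    """Return True if normalizing twice equals normalizing once."""
    once = unicodedata.normalize(form, text)
    twice = unicodedata.normalize(form, once)
    return twice == once


def survives_other_version(text: str, form: str = "NFKC") -> bool:
    """Return True if text normalized with the current tables is left
    unchanged when normalized again with Python's Unicode 3.2 tables."""
    current = unicodedata.normalize(form, text)
    return unicodedata.ucd_3_2_0.normalize(form, current) == current


# Hypothetical sample: a starter, a combining mark, and a second character
# that could compose with the starter if the mark did not intervene.
sample = "\u0b47\u0300\u0b3e"
print(is_internally_idempotent(sample))
print(survives_other_version(sample))
```

A check like this only covers the strings it is given. It cannot show that either property holds for all input.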
If you look at TR15, section 3 Versioning and stability, the first
It is crucial that normalization forms remain stable over time. That
is, if a string that does not have any unassigned characters is
normalized under one version of Unicode, it must remain normalized
under all future versions of Unicode. This is the backwards
This requirement, version-idempotency, appears to be violated in
order to achieve internal-idempotency.
Nowhere in the current document can I find any text that says that
internal-idempotency was a design goal or even a requirement. The #61
review issue mentions these goals in an annex -- is that even part of
the normative text?
> It happens that it affected my implementation of normalization that
> I've made for my language. I already fixed it. Are you saying that I
> should break it again?
What are you using normalization for? If it is for StringPrep,
including internationalized domain names, you should revert your fix
because StringPrep uses Unicode 3.2 without the proposed update.
>> However, I am concerned that normalization stability is given so
>> little weight that it is violated even for situations that don't
>> appear to have practical consequences.
> I am more concerned with maintaining bugs forever in the name of
Right, it is a trade-off. If you care more about internal-idempotency
than version-idempotency, I understand.
> If this particular change can have practical consequence, it's more
> probable that something will break with the old definition (because
> a subsystem relied on idempotency) than with the new one.
This is a conclusion that I have failed to reach.
Several IETF protocols are being modified to use StringPrep today,
which uses the old normalization. When/if StringPrep is updated to use
the new normalization, those protocols appear to be faced with an
I have not seen enough public discussion of this problem to make
me comfortable with this change. If there were a plan for handling the
upgrade problem, I would be more comfortable.