From: Peter Kirk (peterkirk@qaya.org)
Date: Wed Jan 26 2005 - 17:51:28 CST
On 26/01/2005 19:57, Simon Josefsson wrote:
> ...
>
>
>I think there are two kinds of idempotency under discussion:
>
>The first, "internal-idempotency", is that NFKC(NFKC(x)) = x.
>
>
Surely you mean NFKC(NFKC(x)) = NFKC(x), and similarly for the following.
>The second, "version-idempotency", is that NFKC3.2(NFKC4.0(x)) = x.
>
>The #61 proposal trades the second for the first.
>
>
>
No, because "version-idempotency" does not hold even if NFKC4.0 is
identical to NFKC3.2, since NFKC3.2 itself is not internally idempotent.
Anyway, this is not the open review issue #61 but the long-closed review
issue #29.
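To make the two properties concrete, here is a rough sketch in Python. It assumes that the standard `unicodedata` module can stand in for "the current NFKC" and that its `ucd_3_2_0` object can stand in for "NFKC3.2"; neither is claimed to reproduce the uncorrected 3.2 behaviour that issue #29 was about. The function names and sample strings are made up for illustration.

```python
import unicodedata


def internally_idempotent(s: str, form: str = "NFKC") -> bool:
    """True if normalising an already-normalised string changes nothing."""
    once = unicodedata.normalize(form, s)
    return unicodedata.normalize(form, once) == once


def version_idempotent(s: str, form: str = "NFKC") -> bool:
    """True if the 3.2.0 tables leave the current-version result untouched.

    This is the corrected form of the property:
    NFKC3.2(NFKC_current(x)) == NFKC_current(x).
    """
    current = unicodedata.normalize(form, s)
    return unicodedata.ucd_3_2_0.normalize(form, current) == current


# Hypothetical sample strings: a ligature, a base letter plus combining
# accent, and the Angstrom sign.
SAMPLES = ["\ufb01", "e\u0301", "\u212b"]

for s in SAMPLES:
    print(repr(s), internally_idempotent(s), version_idempotent(s))
```

A check like this can only show that the properties hold for the strings tried; it cannot show that they hold for every string, which is the guarantee under discussion.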
But I agree with you that there is something very odd about a situation
in which Unicode refuses point blank to consider correcting certain
errors, e.g. (arguably) mis-assigned combining classes, on the basis that
in theoretical cases the corrections would break normalisation stability,
and yet allowed this correction. My solution, however, would be the opposite of yours:
Unicode should recognise that there can be no absolute guarantee of
normalisation stability from one version of the standard to another, and
recommend that any string should be normalised before any operation
which depends critically on normalisation.
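As a minimal sketch of that recommendation, again assuming Python's `unicodedata` module (the helper name is hypothetical): the comparison normalises both operands itself rather than trusting that they arrived already normalised.

```python
import unicodedata


def equal_after_normalisation(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings by normalising both at the point of use."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


precomposed = "\u00e9"   # e with acute as a single code point
decomposed = "e\u0301"   # e followed by a combining acute accent

# The raw code point sequences differ, but the strings are canonically
# equivalent, so the normalising comparison treats them as equal.
print(precomposed == decomposed)
print(equal_after_normalisation(precomposed, decomposed))
```

Because the normalisation happens inside the operation, the result does not depend on which version of the standard some earlier process used to normalise its output.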
In fact Unicode's own conformance requirements more or less forbid a
process from relying on normalisation stability, i.e. depending on a
string being already normalised, because that implies that the process
is making a distinction between canonically equivalent representations.
So what is the point of insisting on normalisation stability?
--
Peter Kirk
peter@qaya.org (personal)
peterkirk@qaya.org (work)
http://www.qaya.org/