Re: Unicode Search Engines

From: Martin Kochanski (unicode@cardbox.net)
Date: Thu Feb 21 2002 - 06:50:13 EST


What happens if Unicode 3.3 defines a new precomposed character (such as q-tilde)? Does this mean that all existing documents might become retrospectively unnormalised and therefore invalid?
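
To make the question concrete, here is a minimal sketch of what "normalised" means for a sequence like q-tilde. It assumes Python's standard unicodedata module and whatever Unicode version that module ships with. The helper name is_nfc is hypothetical, introduced only for this illustration.

```python
import unicodedata

def is_nfc(text: str) -> bool:
    # A string is in NFC exactly when NFC normalization leaves it unchanged.
    return unicodedata.normalize("NFC", text) == text

# n + COMBINING TILDE: a precomposed form (U+00F1) exists, so this is not NFC.
n_tilde_decomposed = "n\u0303"
# q + COMBINING TILDE: no precomposed form exists, so this is already NFC.
q_tilde_decomposed = "q\u0303"

print(is_nfc(n_tilde_decomposed))  # False
print(is_nfc(q_tilde_decomposed))  # True
```

The worry is whether the second result could change from True to False if a later version added a precomposed q-tilde.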

At 13:21 20/02/02 -0500, John Cowan wrote:
>The W3C CharMod wants receivers to check normalization and
>reject unnormalized documents, *not* to normalize input. Silently
>normalizing input can conceal the existence of a security-related
>spoof that is NFC-equivalent to a genuine document.
>It is essentially the same reason that broken HTML or broken UTF-8
>should not be silently repaired.
>
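
The check-and-reject behaviour described in the quoted text might look roughly like the sketch below. This is an illustration under stated assumptions, not anything taken from CharMod itself: accept_document and UnnormalizedInputError are hypothetical names, and NFC is assumed to be the required form.

```python
import unicodedata

class UnnormalizedInputError(ValueError):
    """Hypothetical error type for input that is not in NFC."""

def accept_document(text: str) -> str:
    # Check only: the receiver never repairs the input, it accepts or rejects.
    if unicodedata.normalize("NFC", text) != text:
        raise UnnormalizedInputError("document is not in NFC; rejecting")
    return text

genuine = "caf\u00e9"   # precomposed e-acute, already NFC
spoof = "cafe\u0301"    # e + COMBINING ACUTE ACCENT, NFC-equivalent to genuine

accept_document(genuine)          # accepted unchanged
try:
    accept_document(spoof)
except UnnormalizedInputError as err:
    print("rejected:", err)

# A receiver that normalized silently would no longer see any difference:
print(unicodedata.normalize("NFC", spoof) == genuine)  # True
```

The last line shows the concern: after silent normalization the two inputs are indistinguishable, so the fact that a different byte sequence was submitted is lost.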


