Re: Merging combining classes, was: New contribution N2676

From: jon@hackcraft.net
Date: Thu Oct 30 2003 - 04:45:13 CST


> On 29/10/2003 15:07, John Cowan wrote:
>
> >Not necessarily. A process may check its input for normalization and
> >reject it if it is not normalized, and XML consumers are encouraged
> >(not required) to do so.
> >
> >
> >
> This looks to me like a clear breach of C9, at least of the derived
> principle
>
> > no process can assume that another process will make a distinction
> > between two different, but canonical-equivalent character sequences.
>
> Another process may not be assumed to make a distinction between
> normalised and non-normalised forms and so may not be assumed to
> normalise, accurately or at all.

It's perfectly reasonable, if a specification calls for input to be in a
particular normalisation form for the process to reject input that isn't. In
requiring a particular normalisation form you are adding a requirement for the
data in addition to those entailed by saying the data is in Unicode, which is
no different than adding a requirement that particular characters be given
particular meaning above the semantics they have as Unicode characters (e.g.
XML does this with < and >).

This extra requirement is supplied "on top of" C9, and encountered before C9
comes into play. Similarly there is no *assumption* about the treatment of
canonical-equivalent character sequences; rather there is a specification
prescribing the use of NFC and allowing processes to reject non-NFC data.
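
To make the point concrete, here is a minimal sketch in Python of a process
that applies such an extra requirement. The names accept_nfc_only and
NotNormalizedError are hypothetical and purely illustrative. The only library
call is the standard unicodedata.normalize. The process does not assume
anything about how other processes treat canonical-equivalent sequences; it
simply checks its own input against the form the (assumed) specification
prescribes and rejects anything else.

```python
import unicodedata


class NotNormalizedError(ValueError):
    """Hypothetical error raised when input is not in the required form."""


def accept_nfc_only(text: str) -> str:
    """Return text unchanged if it is already in NFC, otherwise reject it.

    The check compares the input with its own NFC normalisation, so the
    input is never silently repaired, only accepted or refused.
    """
    if unicodedata.normalize("NFC", text) != text:
        raise NotNormalizedError("input is not in NFC")
    return text


# U+00E9 (precomposed e-acute) is already in NFC, so it is accepted.
composed = "\u00e9"
accept_nfc_only(composed)

# U+0065 U+0301 is canonically equivalent to U+00E9 but is not NFC,
# so a process enforcing the extra requirement refuses it.
decomposed = "e\u0301"
try:
    accept_nfc_only(decomposed)
except NotNormalizedError as err:
    print("rejected:", err)
```

Note that the two strings remain canonically equivalent as far as Unicode is
concerned; the rejection comes from the additional requirement, not from any
claim that they mean different things.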


