From: Kenneth Whistler (email@example.com)
Date: Wed Oct 13 2004 - 12:30:29 CST
> Jon Hanna wrote:
> >>imported UTF-8 sequences like [U+0065][U+0303] <e, tilde> get
> >>internally to [U+1ebd] LATIN SMALL LETTER E WITH TILDE.
> >>Is this kind of behavior what one would expect?
> >That's conformant, if it causes problems with any other process (including
> >other processes that are part of the system in question) then that other
> >process isn't complying with conformance clause C9.
And Eric Muller asked:
> But what if U+1ebd is not part of the repertoire supported by that other
Ah, but there is "support" and then there is "support".
A conformant implementation can pick and choose the repertoire it
supports for some text processes, e.g. for display. No font is
required to support display of *all* Unicode characters, and
that could perfectly well apply to U+1EBD.
However, implementations don't get to pick and choose so easily
about aspects of the standard such as encoding forms and normalization.
You can't, for example, recognize that <U+006E, U+0303> is canonically
equivalent to U+00F1 (ñ), but claim *not* to recognize that
<U+0065, U+0303> is likewise canonically equivalent to U+1EBD, simply
because U+1EBD is not in a range that your implementation chooses
to "interpret" for display. Such broken, partial recognitions of
canonical equivalence would represent non-conformant implementations
of normalization. That is also why most implementations should depend
on library code for normalization, where the library code specifically
claims to be a conformant implementation of normalization -- and
handles *all* Unicode characters correctly.
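
As a rough illustration of that last point, here is a minimal sketch that
leans on library normalization instead of a hand-rolled table. It assumes
Python and its standard unicodedata module; the helper name
canonically_equivalent is made up for this example and is not taken from
the standard or from any particular implementation.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings by their NFD forms."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# The two combining sequences discussed above.
n_tilde = "\u006E\u0303"  # <n, combining tilde>
e_tilde = "\u0065\u0303"  # <e, combining tilde>

# The library treats both the same way, whether or not a given font
# happens to have a glyph for the precomposed character.
for sequence, precomposed in ((n_tilde, "\u00F1"), (e_tilde, "\u1EBD")):
    composed = unicodedata.normalize("NFC", sequence)
    print(
        f"U+{ord(precomposed):04X}",
        "NFC match:", composed == precomposed,
        "equivalent:", canonically_equivalent(sequence, precomposed),
    )
```

Whether U+1EBD can be displayed is a separate question from whether it is
recognized as equivalent to <U+0065, U+0303>; the sketch only exercises
the latter.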