Re: outside decomposed, inside precomposed

From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Oct 13 2004 - 12:30:29 CST

Next message: Eric Muller: "Re: outside decomposed, inside precomposed"

Previous message: Mike Ayers: "RE: outside decomposed, inside precomposed"
Maybe in reply to: Richard Cook: "outside decomposed, inside precomposed"
Next in thread: Eric Muller: "Re: outside decomposed, inside precomposed"
Reply: Eric Muller: "Re: outside decomposed, inside precomposed"

> Jon Hanna wrote:
>
> >>imported UTF-8 sequences like [U+0065][U+0303] <e, tilde> get
> >>remapped
> >>internally to [U+1ebd] LATIN SMALL LETTER E WITH TILDE.
> >>
> >>Is this kind of behavior what one would expect?
> >>
> >>
> >
> >That's conformant, if it causes problems with any other process (including
> >other processes that are part of the system in question) then that other
> >process isn't complying with conformance clause C9.
> >
> >

And Eric Muller asked:

> But what if U+1ebd is not part of the repertoire supported by that other
> process?

Ah, but there is "support" and then there is "support".

A conformant implementation can pick and choose the repertoire it
supports for some text processes, e.g. for display. No font is
required to support display of *all* Unicode characters, and
that could perfectly well apply to U+1EBD.

However, implementations don't get to pick and choose so easily
about aspects of the standard such as encoding forms and normalization.
You can't, for example, recognize that <U+006E, U+0303> is canonically
equivalent to U+00F1 (ñ), but claim *not* to recognize that
<U+0065, U+0303> is likewise canonically equivalent to U+1EBD, simply
because U+1EBD is not in a range that your implementation chooses
to "interpret" for display. Such broken, partial recognition of
canonical equivalence would represent a non-conformant implementation
of normalization. That is also why most implementations should depend
on library code for normalization, where the library code specifically
claims to be a conformant implementation of normalization -- and
handles *all* Unicode characters correctly.
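
As a rough illustration of relying on library code, here is a minimal
sketch using Python's standard unicodedata module. The choice of Python
and the helper name canonically_equivalent are assumptions made for this
example only; any library that claims conformant normalization should
treat both sequences the same way.

```python
import unicodedata

def canonically_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFD."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# The two decomposed sequences discussed above.
n_tilde = "\u006E\u0303"  # <n, combining tilde>
e_tilde = "\u0065\u0303"  # <e, combining tilde>

# NFC composes each sequence to its precomposed form, whether or not
# the result falls in a range the application can display.
composed_n = unicodedata.normalize("NFC", n_tilde)
composed_e = unicodedata.normalize("NFC", e_tilde)

print([f"U+{ord(c):04X}" for c in composed_n])  # ['U+00F1']
print([f"U+{ord(c):04X}" for c in composed_e])  # ['U+1EBD']

# Equivalence holds in both directions for both characters.
print(canonically_equivalent("\u00F1", n_tilde))  # True
print(canonically_equivalent("\u1EBD", e_tilde))  # True
```

The point of the sketch is only that the library makes no distinction
between the two cases: U+1EBD is handled exactly like U+00F1.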

--Ken


This archive was generated by hypermail 2.1.5 : Wed Oct 13 2004 - 12:32:55 CST