From: Asmus Freytag (email@example.com)
Date: Mon Nov 15 2004 - 00:37:21 CST
At 10:01 PM 11/14/2004, Doug Ewell wrote:
>Asmus Freytag <asmusf at ix dot netcom dot com> wrote:
> > There are some UTF-8/UTF-16 interoperability aspects that are
> > addressed by CESU-8. These concerns are real, and affect multi-
> > component architectures that must interchange data across component
> > boundaries. Therefore a standard specification serves a useful
> > purpose.
>I understand that there are strings in databases originally sorted in
>UTF-16 code point order, for whatever reason, and that these strings
>need to stay sorted in the same order when converted to UTF-8. I happen
>to believe this should be handled in the sort routine, not by inventing
>a new character encoding scheme, and said so at the time. The code to
>perform the necessary transformation is quite small and fast (I think it
>was one of the Markuses who demonstrated this).
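[A minimal sketch of the kind of sort-side fix-up described in the quoted paragraph above. It is not the code that was demonstrated at the time. The names UTF16_ORDER_TABLE, utf16_order_key and reference_key are hypothetical, and the input is assumed to be well-formed UTF-8. The idea: in UTF-8, lead bytes 0xEE and 0xEF (U+E000..U+FFFF) sort below the four-byte lead bytes 0xF0..0xF4, while in UTF-16 the surrogate code units sort below U+E000. Swapping those two lead-byte ranges in the sort key is enough to make a plain byte comparison agree with UTF-16 code unit order.]

```python
# Build a 256-entry translation table that only touches UTF-8 lead bytes.
_table = bytearray(range(256))
for lead in range(0xF0, 0xF5):
    _table[lead] = lead - 2      # F0..F4 (supplementary planes) -> EE..F2
_table[0xEE] = 0xF3              # U+E000..U+EFFF now sorts after supplementary
_table[0xEF] = 0xF4              # U+F000..U+FFFF now sorts last
UTF16_ORDER_TABLE = bytes(_table)


def utf16_order_key(utf8_bytes: bytes) -> bytes:
    """Sort key: well-formed UTF-8 bytes, remapped to compare in UTF-16 order."""
    return utf8_bytes.translate(UTF16_ORDER_TABLE)


def reference_key(utf8_bytes: bytes) -> bytes:
    """Slower cross-check: big-endian UTF-16 bytes compare in code unit order."""
    return utf8_bytes.decode("utf-8").encode("utf-16-be")


if __name__ == "__main__":
    samples = [s.encode("utf-8") for s in ("a", "\uFFFD", "\U00010000", "\uE000")]
    # Plain byte order puts U+FFFD before U+10000; the remapped key reverses that.
    print(sorted(samples))
    print(sorted(samples, key=utf16_order_key))
```

[The stored data stays ordinary UTF-8; only the comparison changes, which is the alternative to a separate encoding scheme that the paragraph argues for.]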
There were extensive discussions on this issue in the UTC and the arguments
you cite were brought forward, but there were counter arguments to them.
It serves little purpose to do a play-by-play re-enactment of that discussion
here on the mailing list. Suffice it to say that the duly appointed
representatives of the member companies came to a proper decision after
evaluating the information placed before them.
In the two years since publication of the report, there has not been the
preponderance of troubles that was one of the predicted outcomes. Nor
is there widespread confusion of it with one of the three encoding forms that are
conformant with Unicode. For recent documents (6 months) there are 720 hits
for CESU-8 on Google. To put that in perspective, consider the 2,820,000 hits
that come up for the same time frame when you search for UTF-8.
Let's drop this, or take it offline from here, shall we?