Re: Opinions on this Java URL?

From: Asmus Freytag (
Date: Mon Nov 15 2004 - 00:37:21 CST


    At 10:01 PM 11/14/2004, Doug Ewell wrote:
    >Asmus Freytag <asmusf at ix dot netcom dot com> wrote:
    > > There are some UTF-8/UTF-16 interoperability aspects that are
    > > addressed by CESU-8. These concerns are real, and affect multi-
    > > component architectures that must interchange data across component
    > > boundaries. Therefore a standard specification serves a useful
    > > purpose.
    >I understand that there are strings in databases originally sorted in
    >UTF-16 code point order, for whatever reason, and that these strings
    >need to stay sorted in the same order when converted to UTF-8. I happen
    >to believe this should be handled in the sort routine, not by inventing
    >a new character encoding scheme, and said so at the time. The code to
    >perform the necessary transformation is quite small and fast (I think it
    >was one of the Markuses who demonstrated this).
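
    As an illustration of the kind of sort-routine fix-up described in the
    quoted text, here is a minimal sketch in Python. It is not the code that
    was demonstrated on the list; the name utf16_order_key and the table
    _UTF16_ORDER are hypothetical, and the input is assumed to be well-formed
    UTF-8. In UTF-16 code unit order, supplementary characters (encoded with
    surrogates D800..DFFF) sort before U+E000..U+FFFF, whereas in UTF-8 byte
    order they sort after. Remapping the two lead bytes 0xEE and 0xEF above
    the supplementary lead bytes 0xF0..0xF4 appears to be enough to reconcile
    the two orders.

```python
# Sketch only: assumes well-formed UTF-8 input (no encoded surrogates,
# no bytes 0xF5..0xFF). Names here are illustrative, not from any library.

# Translation table: move the lead bytes of U+E000..U+FFFF (0xEE, 0xEF)
# to 0xFE, 0xFF so they compare above the lead bytes of supplementary
# characters (0xF0..0xF4). All other byte values are left unchanged.
_UTF16_ORDER = bytes(b + 0x10 if b in (0xEE, 0xEF) else b for b in range(256))


def utf16_order_key(utf8: bytes) -> bytes:
    """Return a sort key that orders UTF-8 strings by UTF-16 code units."""
    # Trail bytes are always in 0x80..0xBF, so only lead bytes are affected.
    return utf8.translate(_UTF16_ORDER)


if __name__ == "__main__":
    rows = ["\uffff", "\U00010000", "a", "\ue000"]
    utf8_rows = [s.encode("utf-8") for s in rows]
    # Sort the UTF-8 data as a UTF-16 based component would have sorted it.
    for row in sorted(utf8_rows, key=utf16_order_key):
        print(row.hex())
```

    The stored data stays in standard UTF-8; only the comparison changes,
    which is the design choice the quoted text argues for.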

    There were extensive discussions on this issue in the UTC and the arguments
    you cite were brought forward, but there were counter arguments to them.
    It serves little purpose to do a play-by-play re-enactment of that discussion
    here on the mailing list. Suffice it to say that the duly appointed
    representatives of the member companies came to a proper decision after
    evaluating the information placed before them.

    In the two years since publication of the report, there's not been the
    preponderance of troubles that was one of the predicted outcomes. Nor
    is there widespread confusion of it with one of the three encoding forms
    that are conformant with Unicode. For recent documents (6 months) there are
    720 hits for CESU-8 on Google. To give that perspective, consider the
    2,820,000 hits that come up for the same time frame when you search for UTF-8.

    Let's drop this, or take it offline from here, shall we?


    This archive was generated by hypermail 2.1.5 : Mon Nov 15 2004 - 00:39:01 CST