Re: Opinions on this Java URL?

From: Doug Ewell (dewell@adelphia.net)
Date: Sun Nov 14 2004 - 01:36:18 CST

    Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

    >> What is a shame is that Unicode published a definition of the
    >> defective CESU-8 at all.
    >
    > On that point at least we agree. I wonder why CESU-8 was created, if
    > there actually exist applications needing it.

    UTC could have simply acknowledged that certain applications and vendors
    have created their own transformation formats for internal use, based
    on, but incompatible with, existing Unicode encoding schemes. Oracle
    has a UTF-8-like one that encodes each supplementary code point in six
    bytes instead of four, as two three-byte encoded surrogates. Sun has a
    similar one that also encodes U+0000 as two bytes (0xC0 0x80) instead
    of one. Someone else might decide to use one of the "zany" UTFs
    invented by Marco Cimarosti or me.
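
To make the byte-level differences concrete, here is a rough Python sketch of the two vendor variants described above. The function name and its two flags are invented for this illustration and do not come from any Oracle or Sun library; the only library call used is the standard "surrogatepass" error handler, which lets Python emit a lone surrogate as a three-byte sequence.

```python
def encode_utf8_like(text, six_byte_supplementary=False, two_byte_nul=False):
    """Encode text as UTF-8, optionally with the vendor-style deviations.

    Hypothetical helper for illustration only:
      six_byte_supplementary -- encode each supplementary code point as two
                                three-byte surrogates (the Oracle-style form)
      two_byte_nul           -- encode U+0000 as C0 80 (the Sun-style addition)
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp == 0 and two_byte_nul:
            out += b"\xc0\x80"
        elif cp > 0xFFFF and six_byte_supplementary:
            # Split into a UTF-16 surrogate pair, then encode each half
            # as if it were an ordinary BMP code point.
            v = cp - 0x10000
            high = 0xD800 + (v >> 10)
            low = 0xDC00 + (v & 0x3FF)
            for surrogate in (high, low):
                out += chr(surrogate).encode("utf-8", "surrogatepass")
        else:
            out += ch.encode("utf-8")
    return bytes(out)


sample = "\U00010400"  # a supplementary code point
standard = encode_utf8_like(sample)
six_byte = encode_utf8_like(sample, six_byte_supplementary=True)
nul_form = encode_utf8_like("\x00", six_byte_supplementary=True, two_byte_nul=True)

print(standard.hex(" "))  # four bytes in real UTF-8
print(six_byte.hex(" "))  # six bytes in the surrogate-based variant
print(nul_form.hex(" "))  # two bytes for U+0000 in the second variant
```

With both flags off the function is plain UTF-8, so anything in the BMP other than U+0000 comes out byte-for-byte identical in all three forms. That is exactly why the variants are easy to mistake for UTF-8 until a supplementary character or a null turns up.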

    Whatever... but there was no need to publish a Technical Report
    describing Oracle's custom format, giving it a formal-sounding name like
    "CESU-8" and registering it as an IANA charset for interchange. Not
    everyone outside this list is familiar with the fine distinction between
    a UTR, officially approved by UTC, and a UTN, published but not approved
    by UTC. I hope UTC does not ever go the "CESU-8" route with a UTN
    describing Sun's broken format.

    > On the other hand, the Java modified UTF-8 (in fact closer to
    > CESU-8) has proven to be useful and is widely used... Simply because
    > it is compatible with standard C libraries for null-terminated
    > strings.

    An unusual type of "compatible" that makes a special allowance for
    strings with embedded nulls, impossible by definition in C.
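
A small Python sketch of what that "compatibility" amounts to, assuming BMP-only text so that the only difference from real UTF-8 is the C0 80 form of U+0000. Both helper names are made up for this example; `c_strlen` merely imitates how a C routine finds the end of a null-terminated string.

```python
def to_two_byte_nul_form(text):
    """Hypothetical helper: UTF-8 with U+0000 written as C0 80.

    Restricted to BMP text, where this is the only deviation from UTF-8.
    """
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("sketch handles BMP text only")
    # In UTF-8 a zero byte can only come from U+0000, so a byte-level
    # replace is safe here.
    return text.encode("utf-8").replace(b"\x00", b"\xc0\x80")


def c_strlen(buf):
    """Length as a C string routine would see it: up to the first zero byte."""
    end = buf.find(b"\x00")
    return len(buf) if end < 0 else end


text = "ab\x00cd"  # a "string" with an embedded null
standard = text.encode("utf-8")
modified = to_two_byte_nul_form(text)

print(len(standard), c_strlen(standard))  # C sees only the part before the null
print(len(modified), c_strlen(modified))  # no zero byte, so C sees every byte
```

The two-byte form does let the whole buffer pass through C string functions, but only by hiding a character that a C string could never have contained in the first place.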

    If the Java architects had wanted a variable-length array of arbitrary
    byte data, they should have created such a type in the first place,
    instead of overloading the string type. Strings are for text. Text
    does not need nulls.

    -Doug Ewell
     Fullerton, California
     http://users.adelphia.net/~dewell/


