Re: Opinions on this Java URL?

From: Theodore H. Smith
Date: Fri Nov 12 2004 - 18:05:20 CST

    I take your point that you are well aware of this. However, some of your
    users are not so aware, having read your information on "Modified
    UTF-8" and thinking "hey, well if Sun does it, then it must be OK for me
    to do it too!"

    This thread was inspired by exactly that. Someone pointed me to this
    page, using it as "proof" that modified UTF-8 is an acceptable thing to
    do.

    While you are well aware, the users aren't. I think it would be a good
    idea to add a small note saying that this feature is going to be
    changed in future versions of Java, or perhaps deprecated, due to its
    incompatibility. Just a small note, on that page and similar pages,
    with the phrase "This will be deprecated in the future because it
    currently contradicts the standard behaviour"... that would make a
    *huge* difference.

    That aside.

    I'm just curious about the \0 thing. What problems would having a \0 in
    UTF-8 present, that are not presented by having \0 in ASCII? I can't
    see any advantage there.

    The only advantage I can imagine would be using UTF-8 to store \0
    in places where that previously wasn't possible. To me, that sounds
    like a strange way to add a feature.
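
    For anyone who wants to see the difference concretely, here is a
    minimal sketch comparing the two encodings of a string containing \0.
    It assumes DataOutputStream.writeUTF as the source of modified UTF-8
    (the method documented on the page in question); the class name
    ModifiedUtf8Nul and the helpers modifiedUtf8 and hex are made up for
    this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ModifiedUtf8Nul {

    // Hypothetical helper: encode with writeUTF, then drop the
    // two-byte length prefix that writeUTF puts in front of the data.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Hypothetical helper: render bytes as space-separated hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String s = "a\0b";
        // Standard UTF-8 keeps U+0000 as a single zero byte.
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8))); // 61 00 62
        // Modified UTF-8 writes U+0000 as the two-byte sequence C0 80.
        System.out.println(hex(modifiedUtf8(s)));                    // 61 C0 80 62
    }
}
```

    The visible effect is that the modified form never contains a 0x00
    byte, so the encoded data can pass through code that treats \0 as a
    terminator. The cost is that C0 80 is an overlong sequence, which a
    conforming UTF-8 decoder is required to reject.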

    On 12 Nov 2004, at 23:58, A. Vine wrote:

    > FYI, we are well aware of this shortcoming (modified UTF-8), and with
    > each release try to mitigate it even further. The problem is that it
    > is so deep in the code (note that it is since Java 1.0) that it is not
    > easy to eliminate without breaking a lot of existing stuff, something
    > that the Java team strive to avoid.
    > Theodore H. Smith wrote:
    >> DataInput.html#modified-utf-8
    >> If only people could sue for suggesting bad coding practices ;o)
    >> --
    >> Theodore H. Smith - Software Developer.

    This archive was generated by hypermail 2.1.5 : Fri Nov 12 2004 - 18:11:51 CST