Opinions on this Java URL?

From: John Cowan (cowan@ccil.org)
Date: Sat Nov 13 2004 - 11:13:45 CST


    Theodore H. Smith scripsit:

    > I'm just curious about the \0 thing. What problems would having a \0 in
    > UTF-8 present, that are not presented by having \0 in ASCII? I can't
    > see any advantage there.

    AFAICT it was a hack so that arbitrary Java strings could be encoded
    as C strings; that is, with no 0x00 bytes in them, even when the
    string contained a U+0000 (which is written as the two bytes
    0xC0 0x80 instead). This is the format used in Java class files
    for string constants as well.
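
To make the U+0000 point concrete, here is a minimal sketch comparing what writeUTF emits with what standard UTF-8 emits for the same string. The class name ModifiedUtf8Demo and its helper methods are made up for illustration; only the java.io and java.nio calls are real. Note that writeUTF also puts a 2-byte length in front of the encoded characters, so it is the payload after that prefix that is free of 0x00 bytes, not the whole record.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; the name and helpers are not from the original post.
public class ModifiedUtf8Demo {

    // Bytes produced by DataOutputStream.writeUTF:
    // a 2-byte big-endian length, then the string in Java's modified UTF-8.
    static byte[] writeUtfBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Bytes produced by the ordinary UTF-8 charset encoder.
    static byte[] standardUtf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = "a\u0000b";
        // Length prefix 0,4 then 'a', 0xC0 0x80 (for U+0000), 'b'.
        System.out.println(Arrays.toString(writeUtfBytes(s)));     // [0, 4, 97, -64, -128, 98]
        // Standard UTF-8 encodes U+0000 as a single 0x00 byte.
        System.out.println(Arrays.toString(standardUtf8Bytes(s))); // [97, 0, 98]
    }
}
```

The byte values print as signed Java bytes, so -64 and -128 are 0xC0 and 0x80.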

    The important thing to note is that the readUTF and writeUTF methods
    are *binary* I/O; they are the standard way of serializing strings,
    just as the standard way of serializing ints is to write them out
    as a 4-byte big-endian sequence.

    They simply have nothing to do with character encoding at all.
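
The binary-I/O point can be sketched as follows: an int and a string are written through the same DataOutputStream and read back through a DataInputStream, with no charset chosen anywhere. The class name BinaryIoDemo and its helpers are assumptions made for this example; writeInt, writeUTF, readInt and readUTF are the real java.io methods.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class; the name and helpers are not from the original post.
public class BinaryIoDemo {

    // Serialize an int followed by a string, the way DataOutput defines it.
    static byte[] serialize(int n, String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(n);  // 4 bytes, big-endian
            out.writeUTF(s);  // 2-byte length + modified UTF-8 payload
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Read the same record back and render it as "int:string".
    static String deserialize(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int n = in.readInt();
            String s = in.readUTF();
            return n + ":" + s;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] record = serialize(0x01020304, "hi");
        System.out.println(Arrays.toString(record)); // [1, 2, 3, 4, 0, 2, 104, 105]
        System.out.println(deserialize(record));     // 16909060:hi
    }
}
```

For text that other programs will read as UTF-8, an OutputStreamWriter with an explicit charset is the usual tool; writeUTF is a record format that only readUTF is expected to consume.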

    He made the Legislature meet at one-horse       John Cowan
    tank-towns out in the alfalfa belt, so that     cowan@ccil.org
    hardly nobody could get there and most of       http://www.reutershealth.com
    the leaders would stay home and let him go      http://www.ccil.org/~cowan
    to work and do things as he pleased.    --Mencken, Declaration of Independence
