From: John Cowan (cowan@ccil.org)
Date: Sat Nov 13 2004 - 11:13:45 CST

    Theodore H. Smith scripsit:

    > I'm just curious about the \0 thing. What problems would having a \0 in
    > UTF-8 present, that are not presented by having \0 in ASCII? I can't
    > see any advantage there.

    AFAICT it was a hack so that arbitrary Java strings could be encoded
    as C strings; that is, with no 0x00 bytes in them, even when the
    string contained a U+0000. This is the format used in Java class
    files for string constants as well.
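
    A minimal sketch of the difference, assuming a stock JDK; the class
    and helper names (ModifiedUtf8Demo, writeUtfBytes, hex) are made up
    for illustration. It writes a string containing U+0000 with writeUTF
    and with the ordinary UTF-8 encoder, then dumps both as hex:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class ModifiedUtf8Demo {
    // Hypothetical helper: capture what DataOutputStream.writeUTF emits.
    static byte[] writeUtfBytes(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        }
        return buf.toByteArray();
    }

    // Hypothetical helper: render bytes as space-separated uppercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        String s = "a\u0000b";

        // writeUTF: 2-byte big-endian length prefix, then the payload,
        // in which U+0000 becomes the two bytes C0 80.
        System.out.println(hex(writeUtfBytes(s)));                    // 00 04 61 C0 80 62

        // Standard UTF-8: U+0000 is the single byte 00.
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8)));  // 61 00 62
    }
}
```

    Note that the length prefix emitted by writeUTF can itself contain
    0x00 bytes; it is only the encoded string data after the prefix
    that is free of them.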

    The important thing to note is that the readUTF and writeUTF methods
    are *binary* I/O; they are the standard way of serializing strings,
    just as the standard way of serializing ints is to write them out
    as a 4-byte big-endian sequence.

    They simply have nothing to do with character encoding at all.
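
    To make the binary-I/O point concrete, here is a rough sketch, again
    assuming a stock JDK, with illustrative names (BinaryIoDemo,
    serialize, roundTrip) that are not from any library. It writes an int
    and a string through DataOutputStream and reads them back through
    DataInputStream:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class BinaryIoDemo {
    // Hypothetical helper: serialize an int followed by a string.
    static byte[] serialize(int n, String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(n);   // 4 bytes, big-endian
            out.writeUTF(s);   // 2-byte length prefix + encoded string data
        }
        return buf.toByteArray();
    }

    // Hypothetical helper: read the same two fields back, in the same order.
    static String roundTrip(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int n = in.readInt();
            String s = in.readUTF();
            return n + ":" + s;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = serialize(0x01020304, "hi");

        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        System.out.println(sb.toString().trim());  // 01 02 03 04 00 02 68 69
        System.out.println(roundTrip(data));       // 16909060:hi
    }
}
```

    The reader has to know the field layout in advance, exactly as with
    any other binary record format; nothing in the byte stream announces
    a character encoding.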

