Re: U+0000 in C strings

From: Peter Kirk (peterkirk@qaya.org)
Date: Mon Nov 15 2004 - 06:09:06 CST


    On 15/11/2004 05:48, Doug Ewell wrote:

    > ...
    >
    >Peter Kirk <peterkirk at qaya dot org> wrote:
    >
    > ...
    >
    >>Otherwise what would happen? Would it be acceptable for Java programs
    >>to crash, or even throw error messages, if presented with Unicode
    >>strings including U+0000?
    >>
    >>
    >
    >Peter, what do you think? Is that what I said? I said it should signal
    >the end of the string, as it does in C.
    >
    >

    In another message, Doug wrote:

    >I'd still like to know what practical, real-world TEXT-related benefits
    >would derive from allowing U+0000 in strings of TEXT in a C program.
    >
    >
    >
    The practical situation I have in mind is hypothetical (and not
    important to me personally, as I do very little programming - I am
    making this point more for the general good). Suppose I am writing a
    program in C, or Java, or whatever, to process an arbitrary string of
    Unicode characters, perhaps received from the Internet, before handing
    it on to a higher-level processor. My program works fine until
    someone, for whatever (possibly malicious) reason, sends a string
    containing U+0000. At that point my program crashes, or does something
    I did not intend, which may be a security risk. It might well be a
    security risk if the task of my program is to scan the string for
    security issues: the scan stops at U+0000, finds nothing wrong, and
    the program then passes on the whole Unicode string, including U+0000
    and everything that follows it.
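
    To make the scanning scenario concrete, here is a minimal sketch in C.
    The function names and the "<script" marker are hypothetical, chosen
    only for illustration; they stand in for whatever check a real scanner
    would perform. The point is only that a check built on C string
    functions stops at the first 0x00 byte, while a check that carries an
    explicit length sees the whole buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical length-aware check: returns 1 if the bytes "<script"
   occur anywhere in the first len bytes of buf, else 0. An embedded
   0x00 byte does not end the search. */
static int contains_marker_len(const char *buf, size_t len)
{
    static const char marker[] = "<script";
    size_t mlen = sizeof marker - 1;   /* length without the terminator */

    for (size_t i = 0; i + mlen <= len; i++) {
        if (memcmp(buf + i, marker, mlen) == 0)
            return 1;
    }
    return 0;
}

/* The same hypothetical check written with C string functions.
   strstr treats the first 0x00 as the end of the string, so anything
   after an embedded U+0000 is never examined. buf must be
   NUL-terminated. */
static int contains_marker_cstr(const char *buf)
{
    return strstr(buf, "<script") != NULL;
}
```

    Given the 14 bytes of "hello", 0x00, "<script>", the strstr version
    reports nothing, while the length-aware version finds the marker. If
    the program then forwards all 14 bytes, the unexamined tail goes with
    them.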

    What should my program have done? It could have flagged U+0000 as an
    illegal character, but it is not one; there might be a good reason for
    its being in the string, and it is not the business of my program to
    interpret such things. If I am going to use string handling at all, I
    need to use some kind of escape mechanism to stop this legal U+0000
    from being misinterpreted as the end of the string. For better or for
    worse, Java provides a mechanism for this situation.
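
    As an illustration of such an escape mechanism, the sketch below
    assumes the mechanism meant here is Java's "modified UTF-8" (the form
    used by DataOutputStream.writeUTF and by JNI), in which U+0000 is
    written as the two bytes 0xC0 0x80 instead of a single 0x00. The
    function name is hypothetical, and the code handles only 16-bit code
    units, one at a time, which is how that format treats them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoder: writes n UTF-16 code units to out in the style
   of modified UTF-8 and appends a terminating 0x00. Returns the number
   of bytes written, not counting the terminator. The caller must
   provide room for 3 * n + 1 bytes. */
static size_t encode_modified_utf8(const uint16_t *units, size_t n,
                                   unsigned char *out)
{
    size_t k = 0;

    for (size_t i = 0; i < n; i++) {
        uint16_t u = units[i];

        if (u != 0 && u < 0x80) {
            /* U+0001..U+007F: one byte, as in ordinary UTF-8 */
            out[k++] = (unsigned char)u;
        } else if (u < 0x800) {
            /* U+0000 and U+0080..U+07FF: two bytes.
               U+0000 comes out as 0xC0 0x80, never as 0x00. */
            out[k++] = (unsigned char)(0xC0 | (u >> 6));
            out[k++] = (unsigned char)(0x80 | (u & 0x3F));
        } else {
            /* U+0800..U+FFFF: three bytes */
            out[k++] = (unsigned char)(0xE0 | (u >> 12));
            out[k++] = (unsigned char)(0x80 | ((u >> 6) & 0x3F));
            out[k++] = (unsigned char)(0x80 | (u & 0x3F));
        }
    }
    out[k] = 0;   /* the only 0x00 byte in the output is the terminator */
    return k;
}
```

    Encoding the three code units 'A', U+0000, 'B' this way yields the
    bytes 0x41 0xC0 0x80 0x42 followed by the terminator, so strlen and
    similar functions see all four bytes. The cost is that 0xC0 0x80 is
    not valid standard UTF-8, so the result has to be decoded by something
    that knows the convention.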

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    

