From: Peter Kirk (firstname.lastname@example.org)
Date: Mon Nov 15 2004 - 06:09:06 CST
On 15/11/2004 05:48, Doug Ewell wrote:
>Peter Kirk <peterkirk at qaya dot org> wrote:
>>Otherwise what would happen? Would it be acceptable for Java programs
>>to crash, or even throw error messages, if presented with Unicode
>>strings including U+0000?
>Peter, what do you think? Is that what I said? I said it should signal
>the end of the string, as it does in C.
In another message, Doug wrote:
>I'd still like to know what practical, real-world TEXT-related benefits
>would derive from allowing U+0000 in strings of TEXT in a C program.
The practical situation which I have in mind (although not important to
me personally as I do very little programming - I am making this point
more for the general good) is when (hypothetically) I am trying to write
a program in C, or Java, or whatever, to process an arbitrary string of
Unicode characters, perhaps received from the Internet, before handing
them on to a higher level processor. My program works fine until
someone, for whatever (possibly malicious) reason, sends a string
containing U+0000. At that point my program crashes, or does something I
did not intend, which may be a security risk. It would be one if the
task of my program is to scan the string for security issues: a scan
that stops at U+0000 finds nothing, and the program then passes on the
whole Unicode string, including U+0000 and the unscanned text after it.
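To make the scenario concrete, here is a minimal sketch in Java. It is
only an illustration under stated assumptions: the class name, the
method names and the one-entry blocklist are hypothetical, and the
"C-style" check merely imitates a scanner that treats U+0000 as the end
of the string.

```java
final class NulScanDemo {
    // Hypothetical blocklist entry, chosen only for illustration.
    private static final String BLOCKED = "<script";

    // Imitates C string semantics: everything from the first U+0000
    // onwards is invisible to the scan.
    static boolean looksSafeCStyle(String s) {
        int end = s.indexOf('\u0000');
        String visible = (end < 0) ? s : s.substring(0, end);
        return !visible.contains(BLOCKED);
    }

    // Scans the whole string; U+0000 is treated as an ordinary character.
    static boolean looksSafeFullLength(String s) {
        return !s.contains(BLOCKED);
    }

    public static void main(String[] args) {
        String input = "hello\u0000<script>alert(1)</script>";
        System.out.println(looksSafeCStyle(input));     // true: text after U+0000 was never scanned
        System.out.println(looksSafeFullLength(input)); // false: the blocked text is found
    }
}
```

If the whole string is forwarded after the first check passes, the next
program in the chain receives text that nobody examined.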
What should my program have done? It could have flagged U+0000 as an
illegal character, but it is not one; there might be a good reason for it
being in the string, and it is not the business of my program to
interpret such things. If I am going to use string handling at all, I
need to use some kind of escape mechanism to stop this legal U+0000
being misinterpreted. For better or for worse, Java provides a
mechanism for this situation.
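As a rough illustration, and assuming the mechanism meant here is
Java's "modified UTF-8" (the encoding used by DataOutputStream.writeUTF),
the sketch below compares it with standard UTF-8 for a string containing
U+0000. The class and helper names are hypothetical; only the java.io
and java.nio calls are real library API.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ModifiedUtf8Demo {
    // Encodes s with DataOutputStream.writeUTF and drops the two-byte
    // length prefix that writeUTF puts in front of the encoded bytes.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Formats bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "a\u0000b";
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8))); // 61 00 62
        System.out.println(hex(modifiedUtf8(s)));                    // 61 C0 80 62
    }
}
```

In the second form U+0000 becomes the two bytes C0 80, so the encoded
data contains no zero byte and a C-style string routine will not stop
early. The price is that the result is not standard UTF-8.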
-- Peter Kirk email@example.com (personal) firstname.lastname@example.org (work) http://www.qaya.org/
This archive was generated by hypermail 2.1.5 : Mon Nov 15 2004 - 10:07:30 CST