Re: Zero termination

From: Andrew Lipscomb (ewwa@chattanooga.net)
Date: Mon Jun 29 2009 - 09:46:03 CDT


    > Hi grp,
    > I just want to know if a valid UTF-16 string can contain the value
    > zero (0), not the character zero but the 16-bit value zero. Like, if
    > I iterate through each Unicode character (16 bits), will I find zero
    > at any time? Is zero a valid code point or a part of a code point?
    > Basically, can I use zero to represent termination of a U16 string?
    > Because if zero is in the middle of the string, then the program
    > will terminate in the wrong place.

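    Unicode is not directly concerned with the validity of strings as
    such. Zero is a valid code point, U+0000 (NULL), and in UTF-16 the
    16-bit code unit 0x0000 encodes that character and nothing else:
    surrogate code units all fall in the range 0xD800 to 0xDFFF, so a
    zero code unit is never part of another character. A single byte
    containing the hex value 00, on the other hand, will occur
    frequently within typical UTF-16 text (most notably for all
    characters in ISO Latin-1, and for two common characters in
    Chinese: the space and the ideogram for "one", U+4E00). So a
    byte-wise scan for 00 will stop in the wrong place, but a string
    terminated by a pair of 00 bytes (aligned to the code unit
    boundary), in other words a NULL represented in its UTF-16 format,
    is not unheard of as a terminator, provided the text itself never
    contains U+0000. Windows, at least, uses that for its "wide
    character" string format.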
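
The difference between a zero byte and a zero 16-bit code unit can be checked with a short sketch. This is only an illustration, not anything defined by Unicode or Windows: the helper names (utf16_units, has_zero_byte, has_zero_unit, terminated_length) are made up for this example, and little-endian byte order is assumed.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints.

    Assumes little-endian byte order (an arbitrary choice for this sketch).
    """
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def has_zero_byte(text):
    """True if any single byte of the UTF-16 encoding is 0x00."""
    return 0 in text.encode("utf-16-le")


def has_zero_unit(text):
    """True if any aligned 16-bit code unit is 0x0000."""
    return 0 in utf16_units(text)


def terminated_length(units):
    """Count code units before the first 0x0000, as a C-style scan would."""
    count = 0
    for unit in units:
        if unit == 0:
            break
        count += 1
    return count


# Latin-1 letter, the ideogram for "one", and a character outside the BMP
# (encoded as a surrogate pair).
samples = ["A", "\u4e00", "\U0001F600"]
for sample in samples:
    print([hex(u) for u in utf16_units(sample)],
          has_zero_byte(sample), has_zero_unit(sample))
```

Each of the three samples contains a 00 byte but no 0x0000 code unit, so a scan over 16-bit units does not stop early on them. Only text that really contains U+0000 would be cut short, which is what terminated_length shows when given the units of "ab\x00cd".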


