Re: Zero termination

From: Venugopalan G (venunet@gmail.com)
Date: Sat Jun 27 2009 - 12:28:59 CDT

    Thank you very much, Phil and the group. My doubt is cleared.
    Phew! I am going to use a NULL terminator. XD

    Thanks again,
    Venu

    On Sat, Jun 27, 2009 at 10:51 PM, Phillips, Addison <addison@amazon.com> wrote:

    > Venu,
    >
    > Thanks for the detailed description.
    > The input is always readable text in some language (not necessarily
    > English), not an arbitrary UTF-16 stream.
    > Let me put the question a different way.
    >
    > Is it possible that a readable/valid string of any other language has a
    > U+0000 in the middle?
    >
    > AP> No. It doesn’t matter what the language is. The only character in
    > Unicode (and thus UTF-16) that uses the code unit 0x0000 is NULL.
    >
    >
    > I understand that U+0000 is used for representing NULL char. But is it
    > always NULL irrespective of language/charset?
    >
    > AP> Yes. Always.
    >
    >
    >
    > One possibility I could think of is that, e.g., some Chinese character
    > might have one code point encoded as two 16-bit code units,
    >
    > AP> Some Chinese characters (and characters from other scripts) do in
    > fact use two 16-bit code units. These are called a “surrogate pair” and
    > are restricted to a specific range of code units which are never null.
    >
    >
    > where the first 16-bit unit is something and the next 16-bit unit is
    > U+0000. Is that possible?
    >
    > AP> No.
    >
    >
    > Is there any real-world character with such an encoding value? Does
    > Unicode allow character sets to choose U+0000 for their code point
    > representation?
    >
    > AP> Unicode is the character set. It encodes the various scripts used to
    > write the world’s languages, assigning each character a unique code point.
    > The code point U+0000 is assigned (solely, uniquely) to NULL.
    >
    > Addison
    >
    > Addison Phillips
    > Globalization Architect -- Lab126
    >
    > Internationalization is not a feature.
    > It is an architecture.
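
    The surrogate-pair point in the quoted reply can be checked with a small sketch. This is only an illustration, not part of the original thread: the function names (to_surrogates, utf16_code_units, find_terminator) and the sample string are hypothetical. It assumes the standard UTF-16 rule that supplementary code points are split into a high surrogate in 0xD800..0xDBFF and a low surrogate in 0xDC00..0xDFFF, so neither unit can be 0x0000.

```python
# Illustrative sketch only; the function names here are hypothetical.

def to_surrogates(code_point):
    """Split a supplementary code point (U+10000..U+10FFFF) into its
    UTF-16 high and low surrogate code units."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # always in 0xD800..0xDBFF
    low = 0xDC00 + (offset & 0x3FF)   # always in 0xDC00..0xDFFF
    return high, low


def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def find_terminator(units):
    """Return the index of the first 0x0000 code unit, or -1 if none."""
    for index, unit in enumerate(units):
        if unit == 0x0000:
            return index
    return -1


# 'A', a BMP CJK character, and a supplementary CJK character (U+20000).
sample = "A\u4e2d\U00020000"
units = utf16_code_units(sample)
```

    Here units holds 0x0041, 0x4E2D, 0xD840, 0xDC00, and find_terminator(units) finds no zero unit. Note that the check is made per 16-bit code unit: individual bytes can still be zero (0x0041 contains a 0x00 byte), so a byte-wise scan for a terminator would not be safe on UTF-16 data.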



    This archive was generated by hypermail 2.1.5 : Sat Jun 27 2009 - 12:31:56 CDT