Re: Another Querry

From: Doug Ewell (dewell@adelphia.net)
Date: Tue Nov 23 2004 - 23:49:13 CST

Next message: Antoine Leca: "Re: Another Querry"

Previous message: John Cowan: "Re: My Querry"
In reply to: Harshal Trivedi: "Another Querry"
Next in thread: Antoine Leca: "Re: Another Querry"

Harshal Trivedi <harshal dot trivedi at gmail dot com> wrote:

> How can i determine end of UCS-2/UCS-4 string while encoding it in C
> program?
> Normal C string ends with '\0' - ASCII NULL as terminating
> character.What symbol,pattern or character in UCS-2 or UCS-4
> substitutes that ASCII NULL as termination symbol.

You wouldn't normally use the ordinary C string type (an array of char)
to hold a UTF-16 (not UCS-2, please) or UCS-4 string. It isn't meant for
that, for exactly the reason your question implies: incidental zero bytes
will cause premature termination of the string, because almost all C
implementations use 8-bit char units and the string functions stop at the
first zero byte.

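The solution is either to use UTF-8, or to use "wide character" strings
based on 16-bit (or, less likely, 32-bit) "character" units. A wide
string still ends with a zero, but the terminator is one whole zero code
unit (16 or 32 bits wide), not a single zero byte.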
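
To make the terminator point concrete, here is a minimal sketch in C. It assumes the strings are held in arrays of uint16_t and uint32_t from <stdint.h> instead of wchar_t, whose width varies by platform. The function names utf16_len and utf32_len and the sample arrays are hypothetical, made up for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the 16-bit code units that precede the terminating 0x0000 unit.
   A zero byte inside a nonzero unit (e.g. 0x0041) does not end the string. */
size_t utf16_len(const uint16_t *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Same idea for 32-bit units: the terminator is one 0x00000000 unit. */
size_t utf32_len(const uint32_t *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* The text "A" followed by U+20AC EURO SIGN, in three encodings. */
static const uint16_t sample_utf16[] = { 0x0041, 0x20AC, 0x0000 };
static const uint32_t sample_utf32[] = { 0x00000041, 0x000020AC, 0x00000000 };

/* In UTF-8 no zero byte occurs except the terminator itself,
   so the ordinary char functions such as strlen() work unchanged. */
static const char sample_utf8[] = "A\xE2\x82\xAC";
```

With these definitions, utf16_len(sample_utf16) and utf32_len(sample_utf32) each count two code units, while strlen(sample_utf8) counts four bytes. Passing sample_utf16 to strlen() through a char pointer would stop at the zero byte inside 0x0041, which is the premature termination described above.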

-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/


This archive was generated by hypermail 2.1.5 : Tue Nov 23 2004 - 23:51:58 CST