Re: Unicode support

From: Neil Harris (neil@tonal.clara.co.uk)
Date: Wed Jul 27 2005 - 10:54:43 CDT

Next message: Asmus Freytag: "Re: Unicode support"

Previous message: Tunga, Prasad: "Unicode support"
In reply to: Tunga, Prasad: "Unicode support"
Next in thread: Asmus Freytag: "Re: Unicode support"

Tunga, Prasad wrote:

>I have an application (written in C) which currently reads and manipulates ASCII strings. However, I would like to convert it so that it can read Unicode strings.
>What are the basic things I should be looking at to make it compatible with Unicode?
>
There's a difference between Unicode itself and the transformation
formats (encodings) of Unicode. For many purposes, just using UTF-8 and
treating the text as ordinary byte strings in the traditional way works
fine. However, if you want to manipulate Unicode characters directly,
you will need arrays of "wide" characters, plus codecs that translate
between concrete Unicode encodings (such as UTF-8 or UTF-16) and
sequences of Unicode code points (which are what you store in the "wide"
character data type). You will need to know the encodings used on input
and expected on output to choose the correct codecs.
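
As a rough illustration of what such a codec does, here is a minimal
sketch of a UTF-8 decoder that fills an array of code points. The
function name utf8_to_codepoints is made up for this example, and it
stores code points in uint32_t rather than wchar_t so that it does not
depend on how wide wchar_t is on a given platform. A real application
would more likely use a library for this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a NUL-terminated UTF-8 string into code points.
 * Returns the number of code points written to `out`, or (size_t)-1 if the
 * input is malformed or `out` (capacity `cap`) is too small. */
size_t utf8_to_codepoints(const char *s, uint32_t *out, size_t cap)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t n = 0;

    assert(s != NULL && out != NULL);
    while (*p != 0) {
        uint32_t cp, min;
        int extra;

        /* The lead byte tells us how many continuation bytes follow. */
        if (*p < 0x80)                { cp = *p;        extra = 0; min = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; min = 0x80; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; min = 0x800; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; min = 0x10000; }
        else return (size_t)-1;       /* stray continuation or invalid lead byte */
        p++;

        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80)  /* also catches a truncated sequence */
                return (size_t)-1;
            cp = (cp << 6) | (uint32_t)(*p & 0x3F);
            p++;
        }

        /* Reject overlong forms, surrogates, and values above U+10FFFF. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return (size_t)-1;
        if (n == cap)
            return (size_t)-1;
        out[n++] = cp;
    }
    return n;
}
```

Once the text is in this form, each array element is one code point, so
indexing and counting work per character rather than per byte.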

See
http://www.gnu.org/software/libc/manual/html_node/Extended-Char-Intro.html
for more information on wide characters in C. 4-byte wide characters
are best, because each wide character can then hold any Unicode code
point, including those outside the Basic Multilingual Plane, giving a
true 1:1 relationship between code points and stored values.
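
To make the point about width concrete, the sketch below shows what
happens when only 16 bits are available per unit: a code point outside
the Basic Multilingual Plane has to be split into a surrogate pair, so
the 1:1 relationship is lost. The function names are invented for this
example; WCHAR_MAX is the standard macro from <wchar.h>.

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical check: nonzero if this platform's wchar_t can hold every
 * code point up to U+10FFFF in a single element. */
int wchar_holds_any_code_point(void)
{
    return WCHAR_MAX >= 0x10FFFF;
}

/* Hypothetical helper: encode one code point as UTF-16.
 * Returns the number of 16-bit units written (1 or 2), or 0 if `cp` is
 * not a valid Unicode scalar value. */
size_t codepoint_to_utf16(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {               /* BMP: fits in one unit */
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;                    /* 20 bits remain, split 10 + 10 */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

With a 4-byte wide character type none of this splitting is needed,
which is why string manipulation is simpler in that representation.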

GNU libiconv http://www.gnu.org/software/libiconv/ is helpful, too.
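
For example, a conversion between two named encodings with the iconv
interface might look roughly like this. The wrapper convert_text is a
made-up name for illustration; iconv_open, iconv and iconv_close are the
actual library calls. Which encoding names are accepted depends on the
iconv implementation installed, and on some systems you need to link
with -liconv.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: convert NUL-terminated `in` from encoding `from`
 * to encoding `to`, writing a NUL-terminated result into `out`.
 * Returns 0 on success, -1 on any failure (unknown encoding, invalid
 * input, or output buffer too small). */
int convert_text(const char *from, const char *to,
                 const char *in, char *out, size_t outcap)
{
    iconv_t cd;
    char *inp = (char *)in;     /* iconv wants char **, input is not modified */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft;
    size_t rc;

    if (outcap == 0)
        return -1;
    outleft = outcap - 1;       /* keep one byte for the terminating NUL */

    cd = iconv_open(to, from);  /* note the order: target first, then source */
    if (cd == (iconv_t)-1)
        return -1;

    rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t)-1)       /* flush any pending shift state */
        rc = iconv(cd, NULL, NULL, &outp, &outleft);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return -1;
    *outp = '\0';
    return 0;
}
```

A call such as convert_text("ISO-8859-1", "UTF-8", input, buf, sizeof buf)
would then turn Latin-1 input into UTF-8, assuming both names are known
to the local iconv.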

-- Neil


This archive was generated by hypermail 2.1.5 : Wed Jul 27 2005 - 10:57:28 CDT