Re: FW: unicode character on Different Unix platforms ....

Date: Tue Nov 02 1999 - 13:11:30 EST

The answer is that wchar_t is not necessarily Unicode, and that Unicode is
not necessarily stored in 16-bit units.

ANSI C defines wchar_t as an abstract type for "wide" characters but
specifies neither a concrete type nor a character set for it. On some
platforms it is Unicode; on others it is a scalar form of the platform's
default MBCS.
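
To see what a given compiler actually does with wchar_t, a small check
like the following can help. This is only a sketch: the function names
wchar_bits and wchar_too_narrow_for_unicode are made up for illustration,
and WCHAR_MAX is assumed to be provided by <wchar.h>, which is the case on
C99-era and later libraries but may not hold for older ones. It says
nothing about which character set wchar_t holds, only how wide it is.

```c
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */
#include <wchar.h>    /* wchar_t, WCHAR_MAX */

/* Hypothetical helper: width of wchar_t in bits on the platform this
   file is compiled for. */
size_t wchar_bits(void)
{
    return sizeof(wchar_t) * CHAR_BIT;
}

/* Hypothetical helper: nonzero if a single wchar_t cannot hold every
   Unicode code point (U+0000 through U+10FFFF). */
int wchar_too_narrow_for_unicode(void)
{
    return WCHAR_MAX < 0x10FFFF;
}
```

Printing wchar_bits() on each target platform should show directly where
the sizes differ.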

Unicode, on the other hand, is a character set standard that allows several
encodings. The most important ones are UTF-8, UTF-16, and UTF-32. They are
stored using 8-, 16-, or 32-bit integers (unsigned chars, shorts, and ints,
or longs where those are 32 bits wide).
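
As a rough illustration of storing Unicode in explicitly sized units
rather than in wchar_t, here is a sketch that encodes one code point as
UTF-16. It assumes a compiler that provides <stdint.h>; on older compilers
you would substitute your own typedefs for the 16- and 32-bit unsigned
types. The function name utf16_encode is made up for this example.

```c
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uint16_t, uint32_t */

/* Encode one Unicode code point as UTF-16 code units.
   Returns the number of units written to out (1 or 2),
   or 0 if cp is not a valid code point to encode. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                       /* surrogate values are reserved */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;          /* fits in one 16-bit unit */
        return 1;
    }
    if (cp > 0x10FFFF)
        return 0;                       /* beyond the Unicode range */
    cp -= 0x10000;                      /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

For example, U+0041 comes out as the single unit 0x0041, while U+10400
comes out as the pair 0xD801 0xDC00. The point is that the unit size is
fixed by the typedef, not by whatever the platform chose for wchar_t.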

Relying on wchar_t to be anything fixed across platforms will not work.
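
One way to get a fixed format is to keep the text in explicitly sized
units and write the bytes out in a defined order. The sketch below assumes
the data is already UTF-16 in uint16_t units and that big-endian byte
order is wanted in the file; both are assumptions for illustration, and
the name utf16_to_be_bytes is made up.

```c
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uint16_t */

/* Serialize n UTF-16 code units as big-endian bytes.
   dst must have room for 2 * n bytes. Returns the byte count written. */
size_t utf16_to_be_bytes(const uint16_t *src, size_t n, unsigned char *dst)
{
    size_t i;
    for (i = 0; i < n; i++) {
        dst[2 * i]     = (unsigned char)(src[i] >> 8);    /* high byte */
        dst[2 * i + 1] = (unsigned char)(src[i] & 0xFF);  /* low byte */
    }
    return 2 * n;
}
```

The resulting buffer is the same on every platform regardless of the
native byte order or the size of wchar_t, so it can be handed to fwrite()
as is.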


"Magda Danish (Unicode)" <> on 99-11-02 08:42:15

To: "Unicode List" <>
Subject: FW: unicode character on Different Unix platforms ....

-----Original Message-----
From: Shrinivas Kulkarni []
Sent: Tuesday, November 02, 1999 6:14 AM
Subject: unicode character on Different Unix platforms ....

Here is a query on Unicode.
I am building an application which reads a multi-byte character string
from a text file.
The application converts this MBCS string to a Unicode string and writes it
to a dbf file.
The application has to work on Sun Solaris, HP-UX and AIX.
It works fine on AIX and NT.
I use wchar_t to define a wide char (Unicode) string.
On Sun Solaris, wchar_t is defined as unsigned long and on HP-UX it is
defined as unsigned int.

Is my basic assumption that a Unicode character is 2 bytes wide right?
If so, then how is it that different OSs have their own definitions of
wchar_t?

Waiting for your reply.


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:54 EDT