RE: FAQ !?

From: Carl W. Brown
Date: Fri Dec 15 2000 - 11:18:19 EST


Unfortunately, if I remember correctly, Sun is one of those platforms whose
wchar_t is not Unicode.


-----Original Message-----
From: Markus Scherer
Sent: Wednesday, December 13, 2000 9:17 AM
To: Unicode List
Subject: Re: FAQ !?

> I guess this should be a FAQ (but isn't). I need code to convert Unicode
> data between various encoding schemes (UTF16LE to UTF32BE etc.). Are there
> standard routines I can use? If so, where can I find them?

The CD for the Unicode book should have some of this - in any case, these
transformations are fairly simple.
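
To give an idea of how simple, here is a rough sketch of one such transformation, UTF-16LE bytes to UTF-32BE bytes. The function name utf16le_to_utf32be and its error convention are made up for this illustration; this is not code from the CD or from any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert UTF-16LE bytes to UTF-32BE bytes.
 * Returns the number of bytes written to dst, or (size_t)-1 if the input
 * is malformed (odd length, unpaired surrogate) or dst is too small. */
size_t utf16le_to_utf32be(const unsigned char *src, size_t src_len,
                          unsigned char *dst, size_t dst_cap)
{
    size_t i = 0, out = 0;

    if (src_len % 2 != 0)
        return (size_t)-1;

    while (i < src_len) {
        /* Little-endian: the low byte comes first. */
        uint32_t cp = (uint32_t)src[i] | ((uint32_t)src[i + 1] << 8);
        i += 2;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: a low surrogate must follow. */
            uint32_t lo;
            if (i + 2 > src_len)
                return (size_t)-1;
            lo = (uint32_t)src[i] | ((uint32_t)src[i + 1] << 8);
            if (lo < 0xDC00 || lo > 0xDFFF)
                return (size_t)-1;
            i += 2;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            /* Low surrogate with no preceding high surrogate. */
            return (size_t)-1;
        }

        if (dst_cap - out < 4)
            return (size_t)-1;

        /* Big-endian: the most significant byte comes first. */
        dst[out++] = (unsigned char)(cp >> 24);
        dst[out++] = (unsigned char)(cp >> 16);
        dst[out++] = (unsigned char)(cp >> 8);
        dst[out++] = (unsigned char)cp;
    }
    return out;
}
```

The sketch does no byte order mark handling; a real converter would have to decide what to do with a leading U+FEFF.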

Unicode libraries have it. For example, see ICU: its documentation and
source code cover converters and UTF macros.

> As an aside, I have run into trouble porting a database application which
> stores UTF16LE data onto HPUX and SUN machines. I can see that wchar_t there
> is defined as unsigned long, so most probably all wcs*() functions would
> expect UTF32 encoded data. Am I correct in my assumption? What do I do to be
> certain?

wchar_t is a very fuzzy type. It may be 8, 16, or 32 bits depending on the
platform, and there is no general guarantee that it stores Unicode. Most
older systems use it for scalar character code points custom-built for the
char* encoding.
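
One way to see what a given compiler does is to ask it directly. The sketch below reports the width of wchar_t and checks the C99 macro __STDC_ISO_10646__, which an implementation defines when wchar_t values are ISO 10646 code points. The function names are made up for this example. Note that a missing macro does not prove the encoding is something else; it only means the implementation makes no promise.

```c
#include <limits.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: width of wchar_t in bits on this platform. */
int wchar_bits(void)
{
    return (int)(sizeof(wchar_t) * CHAR_BIT);
}

/* Hypothetical helper: 1 if the implementation promises that wchar_t
 * holds ISO 10646 (Unicode) code points, 0 if it makes no such promise. */
int wchar_is_iso10646(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}
```

Even when both answers look favorable, the wcs*() functions still depend on the current locale, so this check only tells you about the type and not about every function that takes it.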

> What online information can I look through for more information on such a
> problem?

About wchar_t and Unicode, see "What size wchar_t do I need for Unicode?"

To be sure, you can use typedefs that are always what you want. ICU and
other libraries define types for string units and scalar code points that
work on all platforms, and they provide functions to work with such Unicode
strings and characters.
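
As a minimal sketch of the typedef approach, assuming a compiler that provides <stdint.h>: the names utf16_unit, code_point and next_code_point below are invented for this example and are not ICU's own types or functions.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t utf16_unit;  /* hypothetical: one UTF-16 code unit, always 16 bits */
typedef uint32_t code_point;  /* hypothetical: one Unicode code point */

/* Hypothetical helper: decode the code point starting at s[*i] and advance
 * *i past it. Assumes *i < len. An unpaired surrogate is returned as-is. */
code_point next_code_point(const utf16_unit *s, size_t len, size_t *i)
{
    code_point c = s[(*i)++];

    if (c >= 0xD800 && c <= 0xDBFF && *i < len) {
        code_point lo = s[*i];
        if (lo >= 0xDC00 && lo <= 0xDFFF) {
            /* Combine the surrogate pair into one supplementary code point. */
            (*i)++;
            c = 0x10000 + ((c - 0xD800) << 10) + (lo - 0xDC00);
        }
    }
    return c;
}
```

Because the types have a fixed width, the same code behaves identically whether the platform's wchar_t is 8, 16, or 32 bits.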

Good luck,
