Re: FW: unicode character on Different Unix platforms ....

From: Valeriy E. Ushakov (uwe@ptc.spbu.ru)
Date: Tue Nov 02 1999 - 14:07:55 EST


On Tue, Nov 02, 1999 at 10:10:34AM -0800, schererm@us.ibm.com wrote:

> ANSI C defines wchar_t as an abstract type for "wide" characters but does
> not specify a concrete type nor a character set for it. On some platforms,
> it is Unicode, on others, it is a scalar form of the platform default MBCS.
[...]
> Relying on wchar_t to be anything fixed across platforms will not work.

Moreover, an implementation with a 1-byte wchar_t is perfectly conforming,
so relying on wchar_t to hold Unicode is *definitely* a bad idea.
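
A minimal sketch of how a program could test this instead of assuming
it. The names UNICODE_MAX, can_hold_unicode and wchar_can_hold_unicode
are made up for illustration; WCHAR_MAX is the standard macro from
<wchar.h>, and 0x10FFFF is the highest Unicode code point.

```c
#include <stddef.h>  /* wchar_t */
#include <wchar.h>   /* WCHAR_MAX */

#define UNICODE_MAX 0x10FFFFUL  /* highest Unicode code point */

/* Hypothetical helper: nonzero if a type whose largest value is
   max_value can represent every Unicode code point. */
static int can_hold_unicode(unsigned long max_value)
{
    return max_value >= UNICODE_MAX;
}

/* Hypothetical helper: nonzero if wchar_t on this implementation is
   wide enough. Assumes WCHAR_MAX fits in an unsigned long. */
static int wchar_can_hold_unicode(void)
{
    return can_hold_unicode((unsigned long)WCHAR_MAX);
}
```

A caller would check wchar_can_hold_unicode() once at startup and fall
back to its own integer type for code points when it returns zero. Note
that a wide enough wchar_t still says nothing about which character set
its values encode.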

From the ISO C9X draft:

       5.2.4.2.1 Sizes of integer types <limits.h>

       [#1] The values given below shall be replaced by constant
       expressions suitable for use in #if preprocessing
       directives. [...] Their implementation-defined values shall
       be equal or greater in magnitude (absolute value) to those
       shown, with the same sign.

       [...]

          - maximum number of bytes in a multibyte character, for
            any supported locale
            MB_LEN_MAX 1
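
Since the draft requires these values to be usable in #if directives,
the minimum above can be tested at compile time. A small sketch, with
SINGLE_BYTE_ONLY and single_byte_only as made-up names: when MB_LEN_MAX
is 1, every multibyte character in every supported locale is a single
byte, and in the simplest case wchar_t then needs no more range than
char.

```c
#include <limits.h>  /* MB_LEN_MAX */

/* Hypothetical flag: 1 if no supported locale has a multibyte
   character longer than one byte, 0 otherwise. */
#if MB_LEN_MAX == 1
#define SINGLE_BYTE_ONLY 1
#else
#define SINGLE_BYTE_ONLY 0
#endif

/* Hypothetical accessor so the flag can also be checked at run time. */
static int single_byte_only(void)
{
    return SINGLE_BYTE_ONLY;
}
```

This only reports what the implementation promises about multibyte
lengths; it does not tell you the size or encoding of wchar_t itself.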

SY, Uwe

-- 
uwe@ptc.spbu.ru                         |       Zu Grunde kommen
http://www.ptc.spbu.ru/~uwe/            |       Ist zu Grunde gehen


