Re: What's in a wchar_t string on unix?

From: Peter Kirk (peterkirk@qaya.org)
Date: Wed Mar 03 2004 - 17:22:03 EST


    On 03/03/2004 11:27, Antoine Leca wrote:

    >Frank Yung-Fong Tang wrote:
    >
    >>Does it also mean wchar_t is 4 bytes if __STDC_ISO_10646__ is defined?
    >>Or does it only mean wchar_t holds characters from ISO 10646
    >>(which means it could be 2 bytes, 4 bytes, or more)?
    >
    >The latter. But if wchar_t is 16 bits, it can only encode Unicode 3.0 or
    >earlier, i.e. no UTF-16 support.
    >
    >Antoine

    Surely if wchar_t is 16 bits, it CAN be used to encode the whole of
    Unicode with UTF-16, with each supplementary plane character
    represented as a "surrogate pair" of two wchar_t code units. Whether
    these characters SHOULD be represented as UTF-16 code units in a
    wchar_t string (or whether the representation should instead be UCS-2
    or UTF-32) is a separate issue, probably related to how the associated
    libraries handle surrogate code units.
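
    To make the arithmetic concrete, here is a minimal sketch of how a
    supplementary plane character splits into a surrogate pair, together
    with a check of what the platform reports about wchar_t. The helper
    names (wchar_bits, wchar_is_iso10646, utf16_encode, utf16_decode_pair)
    are hypothetical and not part of any library. The sketch stores code
    units in uint16_t so that it behaves the same whatever the width of
    wchar_t; on a platform with a 16-bit wchar_t the same values could be
    stored in wchar_t elements, assuming the libraries tolerate surrogates.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: width of wchar_t in bits on this platform
   (commonly 16 or 32, but the C standard fixes neither). */
unsigned wchar_bits(void)
{
    return (unsigned)(sizeof(wchar_t) * CHAR_BIT);
}

/* Hypothetical helper: 1 if the implementation defines
   __STDC_ISO_10646__, i.e. promises that wchar_t values are
   ISO 10646 code points; 0 otherwise. This says nothing about width. */
int wchar_is_iso10646(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: encode one Unicode scalar value as UTF-16.
   Writes 1 or 2 code units to out and returns that count, or returns 0
   if cp is a surrogate code point or lies above U+10FFFF. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;          /* BMP: one code unit */
        return 1;
    }
    cp -= 0x10000;                      /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}

/* Hypothetical helper: combine a high/low surrogate pair back into a
   scalar value. The caller must pass a valid pair. */
uint32_t utf16_decode_pair(uint16_t hi, uint16_t lo)
{
    return 0x10000 + (((uint32_t)hi - 0xD800) << 10)
                   + ((uint32_t)lo - 0xDC00);
}
```

    For example, U+10400 encodes as the two code units 0xD801 0xDC00,
    while any BMP character such as U+0041 stays a single code unit.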

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    

