Re: Unicode & space in programming & l10n

From: Steve Summit (scs@eskimo.com)
Date: Wed Sep 20 2006 - 21:15:16 CDT

    William Poser wrote:
    > I'm confused as to the sense in which C and C++
    > "don't support the Unicode character model". It is
    > very easy to manipulate objects of type wchar_t,
    > arrays thereof, linked lists thereof, and so forth.

    Indeed (or, as others have pointed out, to manipulate objects
    of type int16_t or int32_t if you want that extra degree of
    explicitness).

    What Standard C doesn't give you (I don't know as much about C++)
    is the full-featured set of Unicode-compatible library routines
    you might expect to have provided for you up-front. Yes, there
    are wcstombs and mbstowcs, but you can't be sure they convert to
    and from UTF-8. Yes, there are iswupper and towlower and the
    others in <wctype.h>, but you can't be sure they'll exactly
    implement the relevant Unicode character classes. And so on.
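
    To make the conversion point concrete, here is a minimal sketch.
    The helper name convert_in_locale is hypothetical, not anything
    from the Standard. It shows that what mbstowcs does with a byte
    string depends entirely on the LC_CTYPE locale in effect, and the
    set of locale names beyond "C" and "" is itself up to the
    implementation.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical helper: convert a multibyte string to wide characters
 * under the named locale.  Returns the number of wide characters
 * written (not counting any terminator), or (size_t)-1 if the locale
 * is unavailable or the bytes are not valid in that locale's encoding.
 *
 * Note: setlocale changes process-wide state and is not restored here.
 */
size_t convert_in_locale(const char *locale_name, const char *bytes,
                         wchar_t *out, size_t out_len)
{
    assert(locale_name != NULL && bytes != NULL && out != NULL);

    /* The multibyte encoding mbstowcs expects is whatever this
       locale says it is; Standard C does not promise UTF-8. */
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return (size_t)-1;

    return mbstowcs(out, bytes, out_len);
}
```

    Calling this with "C" and plain ASCII is safe everywhere. Calling
    it with something like "en_US.UTF-8" (an assumed name; your system
    may spell it differently or not have it) and the bytes
    "\xC3\xA9" may or may not give you the single character U+00E9,
    which is exactly the uncertainty described above.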

    Of course, you can always either roll your own routines, or use a
    third-party library like ICU, so C's lack of "built-in" support
    may not be a serious problem for you in practice. (Or it might be.)
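
    For a sense of what "roll your own" involves at the small end,
    here is a sketch of a UTF-8 encoder for a single code point. The
    name utf8_encode is my own invention, and this is only one
    reasonable way to write it. It covers encoding only; decoding,
    validation of incoming data, normalization and the like are where
    a library such as ICU starts to earn its keep.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode one Unicode scalar value as UTF-8.
 * 'out' must have room for at least 4 bytes.  Returns the number of
 * bytes written, or 0 if cp is a surrogate (U+D800..U+DFFF) or lies
 * beyond U+10FFFF.
 */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);

    if (cp < 0x80) {                      /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogates are not scalar values */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* 4 bytes: 11110xxx then three 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* out of Unicode range */
}
```

    Taking a uint32_t rather than a wchar_t is deliberate: wchar_t may
    be only 16 bits on some implementations, so it cannot be relied on
    to hold every code point.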


