Re: Unicode & space in programming & l10n

From: Steve Summit (scs@eskimo.com)
Date: Wed Sep 20 2006 - 21:15:16 CDT

Next message: William J Poser: "support for full unicode"

Previous message: Steve Summit: "Re: Unicode & space in programming & l10n"
In reply to: William J Poser: "Re: Unicode & space in programming & l10n"
Next in thread: Kenneth Whistler: "Re: Unicode & space in programming & l10n"

William Poser wrote:
> I'm confused as to the sense in which C and C++
> "don't support the Unicode character model". It is
> very easy to manipulate objects of type wchar_t,
> arrays thereof, linked lists thereof, and so forth.

Indeed (or, as others have pointed out, to manipulate objects
of type int16_t or int32_t if you want that extra degree of
explicitness).

What Standard C doesn't give you (I don't know as much about C++)
is the full-featured set of Unicode-compatible library routines
you might expect to have provided for you up-front. Yes, there
are wcstombs and mbstowcs, but they convert to and from the
current locale's multibyte encoding, so you can't be sure that
encoding is UTF-8. Yes, there are iswupper and towlower and the
others in <wctype.h>, but their behavior is locale-dependent too,
so you can't be sure they'll exactly implement the relevant
Unicode character classes. And so on.
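
To make the locale dependence concrete, here is a minimal sketch.
The helper name convert_in_locale is made up for illustration, and
any locale name you pass other than "C" (for example "en_US.UTF-8")
is an assumption about what happens to be installed on your system.

```c
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical helper: switch LC_CTYPE to the named locale, then
 * convert the multibyte string `bytes` into `out`.
 * Returns the number of wide characters written, or (size_t)-1 if
 * the locale is unavailable or the bytes are invalid in that locale.
 * Note that this changes the process-wide locale as a side effect.
 */
size_t convert_in_locale(const char *locale_name, const char *bytes,
                         wchar_t *out, size_t out_len)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return (size_t)-1;          /* locale not installed */
    return mbstowcs(out, bytes, out_len);
}
```

Plain ASCII converts the same way everywhere, but the two bytes
"\xC3\xA9" come out as one wide character only if the locale's
encoding really is UTF-8. Under other locales you may get two
characters, or a conversion failure.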

Of course, you can always either roll your own routines, or use a
third-party library like ICU, so C's lack of "built-in" support
may not be a serious problem for you in practice. (Or it might be.)
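
As a rough idea of what "rolling your own" involves, here is a
sketch of a locale-independent UTF-8 decoder for a single code
point. The function name utf8_decode and its interface are my own
invention, not anything from the Standard or from ICU.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one code point from the UTF-8 sequence at s (n bytes
 * available). Stores the code point in *cp and returns the number
 * of bytes consumed, or 0 if the input is malformed or truncated.
 */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    /* Smallest code point that legitimately needs 1..4 bytes. */
    static const uint32_t min_for_len[5] = {0, 0, 0x80, 0x800, 0x10000};
    size_t len, i;
    uint32_t c;

    if (n == 0)
        return 0;
    if (s[0] < 0x80)                { len = 1; c = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; }
    else
        return 0;       /* stray continuation or invalid lead byte */

    if (len > n)
        return 0;       /* sequence runs past the end of the input */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;   /* not a continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }
    if (c < min_for_len[len])
        return 0;       /* overlong encoding */
    if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;       /* out of range, or a surrogate */
    *cp = c;
    return len;
}
```

This handles only the decoding step. Case mapping, character
classes, normalization and the rest are where a library like ICU
earns its keep.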

