Re: Fw: Unicode & space in programming & l10n

From: Philippe Verdy
Date: Fri Sep 22 2006 - 13:37:26 CDT

    From: "Mike"
    >> That's the meaning I give to the question: "Time to deprecate C/C++ ?"
    > This is probably not the forum for language wars. However, I will
    > defend C++ as a viable language for Unicode programming. I have
    > written a library that performs all four forms of normalization,
    > upper/lowercase conversions, collation using the default Unicode
    > collation element table (no tailoring just yet), and conversion
    > between the various UTF's and Windows' little-endian UTF-16.
    > Internally I use an unsigned int to hold code points, avoiding the
    > uncertainty of wchar_t.

    "unsigned int" is not an adequate datatype: it has the wrong semantics, an unspecified range and an unspecified size (which matters especially for vectors of code units), and it is incompatible with the native string constant datatype and even with the L"..." extended constants. In other words, it is impossible to create portable string constants in C without kludge macros, and impossible to assert that the datatype is correct. Of course you may use platform-specific macros, but once you need to port your program, sometimes to the same OS on a different hardware architecture, the nightmare begins.
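
    To make the range concern concrete, here is a minimal sketch, not anything taken from Mike's library. It assumes a compiler that provides <cstdint> (C++11 or later); the names code_point, is_scalar_value, unsigned_int_holds_all_code_points and wchar_t_bits are hypothetical and exist only for illustration. The language standard only guarantees that unsigned int can hold values up to 65535, while std::uint_least32_t is guaranteed to hold at least 32 bits, so the latter can always represent U+10FFFF.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical alias: a code point type with a guaranteed minimum width.
typedef std::uint_least32_t code_point;

// True for Unicode scalar values: U+0000..U+10FFFF, excluding the
// surrogate range U+D800..U+DFFF.
inline bool is_scalar_value(code_point cp) {
    const bool in_range = cp <= 0x10FFFFul;
    const bool is_surrogate = cp >= 0xD800ul && cp <= 0xDFFFul;
    return in_range && !is_surrogate;
}

// Reports whether unsigned int on this platform can hold every code point.
// The answer depends on the platform; the standard does not promise it.
inline bool unsigned_int_holds_all_code_points() {
    return std::numeric_limits<unsigned int>::max() >= 0x10FFFFul;
}

// Width of wchar_t in bits on this platform (it varies between platforms).
inline std::size_t wchar_t_bits() {
    return sizeof(wchar_t) * CHAR_BIT;
}

// Small self-check of the helpers above.
inline void self_check() {
    assert(is_scalar_value(0x0041ul));    // LATIN CAPITAL LETTER A
    assert(!is_scalar_value(0xD800ul));   // surrogate, not a scalar value
    assert(!is_scalar_value(0x110000ul)); // beyond the Unicode range
}
```

    The helpers only report what the current platform does. They do not address the string constant problem described above, since L"..." literals still have whatever width wchar_t happens to have.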
    It's not the role of a high-level language to make unspecified assumptions about the underlying architecture in its implementation. This goes against all good practices of layered programming. The details of the host architecture should be handled by the OS itself, not by programs (and even most of the services provided with the OS should be layered this way).
    The absence of layered design puts a brake on innovation and creativity, works against scalability and complicates deployment. It costs too much in testing and leaves many "minor" bugs and limitations behind. Those bugs, when accumulated on a large scale (as in any modern OS that offers many services), add instability, unpredictability, problems that are hard to reproduce and solve, and insecurity.
    Yes, I see C/C++ used in the wrong domain (applications). It should have remained in the area of OS kernel and hardware driver development, mostly to implement the hardware abstraction layer, on which most of the rest is built in higher-level languages with strong datatypes based on effective data models.
    As long as C/C++ persists in application programming, there will be pressure to keep maintaining legacy 7/8-bit character encodings.
