Re: UTF-16 inside UTF-8

Date: Wed Nov 05 2003 - 13:13:43 EST


    In a message dated 11/4/2003 9:57:40 PM Pacific Standard Time, writes:
    Peter Kirk <peterkirk at qaya dot org> wrote:

    >> ... (a very old, legacy application, unaware of the existence of
    >> codepoints above U+FFFF) ...
    > Such applications are not "very old", they are still being written.
    I agree with you that even new software may support only the BMP. But that only
    means we need to bring that software forward, instead of bringing the specification

    > For example (see,
    > MySQL 4.1 adds UCS-2 and UTF-8 support to previous versions but for
    > single two-byte codes in UCS-2 and up to three bytes per UTF-8
    > character only :-( - and this is still in alpha!
    Then that is very good news for you! Log a bug against it and request that it be
    a beta stopper. That is probably exactly why "this is still in alpha".
    Does anyone want to look at their code and submit a patch? Can anyone point out
    where the current code that does this is?
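To make the "up to three bytes per UTF-8 character" limit concrete, here is a minimal Python sketch of how many bytes UTF-8 needs for a given code point. The function name `utf8_length` is made up for illustration and has nothing to do with the MySQL source. It only shows that a three-byte cap stops at U+FFFF, so everything above the BMP is excluded.

```python
def utf8_length(cp: int) -> int:
    """Return the number of bytes UTF-8 uses to encode code point cp."""
    if cp < 0 or cp > 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if 0xD800 <= cp <= 0xDFFF:
        # Surrogate code points are not encodable in well-formed UTF-8.
        raise ValueError("surrogate code points cannot be encoded in UTF-8")
    if cp < 0x80:
        return 1          # U+0000..U+007F
    if cp < 0x800:
        return 2          # U+0080..U+07FF
    if cp < 0x10000:
        return 3          # U+0800..U+FFFF (the rest of the BMP)
    return 4              # U+10000..U+10FFFF (supplementary planes)


if __name__ == "__main__":
    for cp in (0x41, 0xE9, 0x0E01, 0xFFFF, 0x10000, 0x10FFFF):
        # Cross-check against Python's own encoder.
        assert utf8_length(cp) == len(chr(cp).encode("utf-8"))
        print(f"U+{cp:04X}: {utf8_length(cp)} byte(s)")
```

A decoder that rejects four-byte sequences therefore accepts exactly the BMP and nothing beyond it.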

    Buggy software is created every day.
    At the risk of upsetting the open-source faithful, that is just plain
    I don't think you should call it "lazy". It is just "under construction" if
    such software is still in "alpha". How much software has such support in its
    "alpha" stage in your company?
    Anyone who can master the wizardly details of building a powerful
    (and commercially successful) database program can figure out how to
    slap two surrogates together without destroying performance.
    Constraining UTF-8 to the BMP is even less defensible, since there is no
    performance penalty in allowing four-byte UTF-8 sequences.
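For readers wondering what "slapping two surrogates together" amounts to, the arithmetic can be sketched in a few lines of Python. The helper names `combine_surrogates` and `split_code_point` are hypothetical, chosen only for this illustration. The sketch follows the standard UTF-16 definition and is not taken from any particular database's code.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("first unit is not a high surrogate")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("second unit is not a low surrogate")
    # Each surrogate carries 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def split_code_point(cp: int) -> tuple[int, int]:
    """Split a supplementary code point into its (high, low) surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not in the supplementary planes")
    v = cp - 0x10000
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


if __name__ == "__main__":
    cp = combine_surrogates(0xD834, 0xDD1E)   # MUSICAL SYMBOL G CLEF
    print(f"U+{cp:X}")
    print([f"{u:04X}" for u in split_code_point(cp)])
    print(len(chr(cp).encode("utf-8")), "bytes in UTF-8")
```

The conversion is two range checks, a shift, and an add, which fits the claim that it need not cost anything noticeable.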

    Frank Yung-Fong Tang
    System Architect, International Development, AOL Interactive Services
    AIM:yungfongta Tel:650-937-2913
    Yahoo! Msg: frankyungfongtan

    John 3:16 "For God so loved the world that he gave his one and only Son, that
    whoever believes in him shall not perish but have eternal life."

    Does your software display Thai language text correctly for Thailand users?
    -> Basic Concept of Thai Language linked from Frank Tang's
    Internationalization Secrets
    Want to translate your English text to something Thailand users can
    understand?
    -> Try English-to-Thai machine translation at

    This archive was generated by hypermail 2.1.5 : Wed Nov 05 2003 - 14:01:50 EST