Re: creating a test font w/ CJKV Extension B characters.

From: Doug Ewell (
Date: Fri Nov 21 2003 - 00:02:49 EST

    Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

    > What is a browser supposed to do if it finds an out-of-range GB
    > sequence that is NOT mapped to Unicode? Does GB18030 specify that
    > these sequences are now "invalid" (and permanently assigned to non-
    > characters, like U+FFFF in Unicode), and not "reserved" for future use
    > (like "unassigned" code points in Unicode) ?

    An invalid GB18030 sequence, like <FE 40>, or a valid but out-of-range
    sequence, like <E3 32 9A 36>, should be treated just like an invalid or
    out-of-range UTF-8 sequence. Issue an error message, format the hard
    disk, whatever; just don't try to treat it like a normal character.
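
    To make the out-of-range case concrete, here is a minimal Python sketch.
    It assumes the usual linear layout of four-byte GB18030 sequences (second
    and fourth bytes 30..39, first and third bytes 81..FE) and that
    <90 30 81 30> corresponds to U+10000. The function names and the constant
    are illustrative only and are not taken from any standard or library.

```python
# Linear index of <90 30 81 30>, assumed here to correspond to U+10000.
SUPPLEMENTARY_BASE = 189000


def four_byte_index(seq: bytes) -> int:
    """Return the linear index of a well-formed four-byte GB18030 sequence."""
    if len(seq) != 4:
        raise ValueError("expected exactly four bytes")
    b1, b2, b3, b4 = seq
    well_formed = (
        0x81 <= b1 <= 0xFE and 0x30 <= b2 <= 0x39
        and 0x81 <= b3 <= 0xFE and 0x30 <= b4 <= 0x39
    )
    if not well_formed:
        raise ValueError("not a well-formed four-byte sequence")
    # Bytes 2 and 4 each have 10 possible values, byte 3 has 126.
    return (((b1 - 0x81) * 10 + (b2 - 0x30)) * 126 + (b3 - 0x81)) * 10 + (b4 - 0x30)


def supplementary_code_point(seq: bytes) -> int:
    """Map a four-byte sequence to a code point in U+10000..U+10FFFF, or raise."""
    cp = four_byte_index(seq) - SUPPLEMENTARY_BASE + 0x10000
    if cp < 0x10000:
        raise ValueError("below the supplementary range; not handled by this sketch")
    if cp > 0x10FFFF:
        raise ValueError("out of range: past U+10FFFF")
    return cp
```

    Under those assumptions, <E3 32 9A 35> comes out as U+10FFFF, and
    <E3 32 9A 36> is one step past it, so the sketch raises an error instead
    of returning a character.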

    > This is critical, because I could fear that some future release of
    > GB18030 may assign some functions to these sequences, which will be
    > impossible to map onto Unicode, but only onto ISO/IEC-10646 "extra"
    > planes.

    <sigh />

    There ARE no "extra" planes in ISO/IEC 10646. They will not be used.
    Ever. Forget you ever heard about them.

    There are one hundred thirty-seven THOUSAND private-use code points. If
    the Chinese insist on encoding characters in GB18030 that haven't been
    approved by UTC and WG2, rest assured there will be plenty of room for
    them in the PUA or EPUA.
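
    For anyone who wants to check the figure, the arithmetic can be sketched
    in a few lines of Python. The ranges are the BMP Private Use Area plus
    the private-use code points of planes 15 and 16; the variable and
    function names are illustrative only.

```python
# Private-use ranges as inclusive (first, last) code point pairs.
PRIVATE_USE_RANGES = [
    (0xE000, 0xF8FF),      # BMP Private Use Area
    (0xF0000, 0xFFFFD),    # plane 15, excluding its last two code points
    (0x100000, 0x10FFFD),  # plane 16, excluding its last two code points
]


def private_use_count() -> int:
    """Count all code points covered by the ranges above."""
    return sum(last - first + 1 for first, last in PRIVATE_USE_RANGES)


print(private_use_count())  # 6400 + 65534 + 65534 = 137468
```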

    -Doug Ewell
     Fullerton, California
