Re: Proposing UTF-21/24

From: Doug Ewell
Date: Wed Jan 24 2007 - 08:45:44 CST

    Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

    > Isn't BOCU-1 much like ISO 2022, which uses some codes to switch
    > between multiple small code pages? Will IBM claim that ISO 2022 falls
    > under its invention, even though what it really does is use a single
    > octet to encode those switches, instead of (possibly) multiple ones
    > in ISO 2022? Remember that ISO 2022 contains not only a profile for
    > 7-bit encoding but also a profile for 8-bit encoding, and with that
    > last option, most code page switches become encoded with a single octet
    > too...

    BOCU-1 isn't anything like that. A badly oversimplified explanation of
    BOCU-1 is:

    1. Start with a "base" value.
    2. Encode each character as the difference between that character and
    the base.
    3. Encode short differences in fewer bytes, larger differences in more.
    4. Move the base after each use to minimize the length of jumps.
    5. Space and C0 control characters get special handling.
    6. Encoded bytes are chosen so that encoded text sorts in the same
    binary order as the original code points.
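
    To make steps 1 through 4 concrete, here is a rough Python sketch of a
    toy difference encoder. It is not BOCU-1: the byte layout (zigzag plus
    a 7-bit varint) and the names toy_encode and toy_decode are my own
    assumptions for illustration, and it ignores steps 5 and 6 entirely.
    The starting base of 0x40 and the "middle of the current 128-character
    block" rule are meant only to approximate the idea in step 4.

```python
def zigzag(n: int) -> int:
    """Map signed deltas to unsigned ints: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return 2 * n if n >= 0 else -2 * n - 1


def unzigzag(u: int) -> int:
    """Inverse of zigzag()."""
    return u // 2 if u % 2 == 0 else -(u + 1) // 2


def next_base(cp: int) -> int:
    """Step 4 (assumed rule): move the base to the middle of cp's 128-block."""
    return (cp & ~0x7F) + 0x40


def toy_encode(text: str) -> bytes:
    out = bytearray()
    base = 0x40                      # step 1: start with a base value
    for ch in text:
        cp = ord(ch)
        u = zigzag(cp - base)        # step 2: encode the difference
        while u >= 0x80:             # step 3: small differences, fewer bytes
            out.append((u & 0x7F) | 0x80)
            u >>= 7
        out.append(u)
        base = next_base(cp)         # step 4: move the base
    return bytes(out)


def toy_decode(data: bytes) -> str:
    chars = []
    base = 0x40
    u = shift = 0
    for b in data:
        u |= (b & 0x7F) << shift
        if b & 0x80:                 # continuation bit: more bytes follow
            shift += 7
            continue
        cp = base + unzigzag(u)
        chars.append(chr(cp))
        base = next_base(cp)
        u = shift = 0
    return "".join(chars)


if __name__ == "__main__":
    word = "Привет"
    print(len(toy_encode(word)), len(word.encode("utf-8")))
```

    In this toy, the first Cyrillic letter costs two bytes for the jump
    from the initial base, and each following letter in the same block
    costs one. A space in the middle of Cyrillic text would drag the base
    back to the Latin range and make the next letter expensive again,
    which is the kind of problem step 5 is there to avoid.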

    If you had said that the use of windows in SCSU was like ISO 2022, that
    would have made more sense to me.

    Doug Ewell  *  Fullerton, California, USA  *  RFC 4645  *  UTN #14
