Re: Proposing UTF-21/24

From: Philippe Verdy
Date: Thu Jan 25 2007 - 21:49:26 CST

    From: "Doug Ewell"
    > Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:
    >> Isn't BOCU-1 much like ISO 2022, which uses some codes to switch
    >> between multiple small code pages? Will IBM claim that ISO 2022 falls
    >> under its invention, even though all BOCU-1 really does is use a single
    >> octet to encode those switches, instead of (possibly) multiple ones
    >> in ISO 2022? Remember that ISO 2022 contains not only a profile for
    >> 7-bit encoding but also a profile for 8-bit encoding, and with that
    >> last option, most codepage switches become encoded with a single octet
    >> too...
    > BOCU-1 isn't anything like that. A badly oversimplified explanation of
    > BOCU-1 is:

    I already know and have read the algorithm, but...

    > 1. Start with a "base" value.

    Read "base" here as the start of a codepage, with some values reserved for special codes such as controls.

    > 2. Encode each character as the difference between that character and
    > the base.

    Which is completely equivalent to taking the code position within a codepage.

    > 3. Encode short differences in fewer bytes, larger differences in more.
    > 4. Move the base after each use to minimize the length of jumps.

    Which, for me, is exactly equivalent to using ISO 2022 jumps...

    > 5. Space and C0 control characters get special handling.

    As do some codes in ISO 2022...

    > 6. Encoded bytes are chosen to be in binary order.
    > If you had said that the use of windows in SCSU was like ISO 2022, that
    > would have made more sense to me.

    I do agree that SCSU is nearer in spirit, but in fact there were many ISO 2022 implementations before SCSU or BOCU, and if you look at the various options they followed, all of them had been studied long before ISO 2022 was adopted as a generic framework.
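To make the two views discussed above easier to compare, here is a minimal sketch in Python. It is not BOCU-1, SCSU or ISO 2022: the function names, the initial base of 0x40 and the 128-character window size are all assumptions chosen for illustration, and the real algorithms choose their base or windows differently and pack the resulting values into bytes. The first coder follows the quoted steps 1, 2 and 4 (difference from a base that moves after each character); the second emits an explicit switch when a character leaves the current window, which is the codepage-style reading.

```python
def delta_encode(text, initial_base=0x40):
    """Toy difference coder: emit (code point - base), then move the base.

    Assumption: the base simply becomes the previous code point.
    """
    base = initial_base
    deltas = []
    for ch in text:
        cp = ord(ch)
        deltas.append(cp - base)  # step 2: difference from the base
        base = cp                 # step 4: move the base after each use
    return deltas


def delta_decode(deltas, initial_base=0x40):
    """Inverse of delta_encode: rebuild each code point from the running base."""
    base = initial_base
    chars = []
    for d in deltas:
        base += d
        chars.append(chr(base))
    return "".join(chars)


def window_encode(text, window_size=128):
    """Toy window coder: an explicit switch when leaving the current window,
    then the character's offset inside that window (a hypothetical format)."""
    window_start = 0
    out = []
    for ch in text:
        cp = ord(ch)
        start = cp - (cp % window_size)
        if start != window_start:
            out.append(("switch", start))
            window_start = start
        out.append(("offset", cp - window_start))
    return out
```

For a run of text within one small script, both coders produce small values after the first character; the visible difference is that the first never emits a separate switch code, while the second does so whenever the window changes.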

    So for me, the IBM patent licensing on SCSU protects very little: only very small changes would be needed to create something that is in fact closer to past works, and that could legitimately be claimed as derived from prior works not covered by the IBM patent claims.

    And because the exact algorithm is patented but licensed for royalty-free use and distribution, what is the point of keeping such a patent on it? Why does IBM not simply declare that the patent will never be changed back to require payment for its use or redistribution? Is it keeping the patent only as a safeguard against claims made by any competitor that could acquire it and decide to license it differently, charging for its use and research?
