RE: BOCU-1 spec

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Feb 16 2007 - 14:33:57 CST

    From Markus Scherer, Friday 16 February 2007 at 19:58:
    > On 2/16/07, Mike <mike-list@pobox.com> wrote:
    > > I see that UTS 40 (BOCU-1) was removed from the website.
    > > I checked UTN 6 and found that it does not contain enough
    > > information to implement the algorithm. Is there a plan
    > > to update UTN 6 with the detail formerly found in UTS 40?

    Judging only by what is in UTN 6, the text is enough to implement BOCU-1,
    and it comes with sample code.
    Please reread:
    http://www.unicode.org/notes/tn6/
    (although this is not the latest commented version, which I read two weeks
    ago).

    One thing is now missing: the integer value of the initial state variable!
    It is not stated in the document itself; it should be prev=0x40.

    But you can find it easily in the header file of the sample C source code:

    /* initial value for "prev": middle of the ASCII range */
    #define BOCU1_ASCII_PREV 0x40
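
    As a minimal sketch of where that constant comes in (not the UTN 6 sample
    code itself): the encoder state starts at prev=0x40, and each code point
    is handled as a signed difference from prev. The names Bocu1State,
    bocu1_state_init and bocu1_delta are hypothetical, chosen only for this
    illustration; the rules for updating prev after each character and for
    turning the difference into bytes are those of UTN 6 and are not
    reproduced here.

```c
#include <stdint.h>

/* Value quoted above from the header file of the sample C source code. */
#define BOCU1_ASCII_PREV 0x40

/* Hypothetical state holder; the name is an assumption, not from UTN 6. */
typedef struct {
    int32_t prev;
} Bocu1State;

/* Start a stream: prev begins in the middle of the ASCII range. */
static void bocu1_state_init(Bocu1State *s) {
    s->prev = BOCU1_ASCII_PREV;
}

/* Signed difference between a code point and the current prev.
 * Turning this difference into bytes is left to the real encoder. */
static int32_t bocu1_delta(const Bocu1State *s, int32_t c) {
    return c - s->prev;
}
```

    With this initial value, the first character 'A' (U+0041) gives a
    difference of +1 and a space (U+0020) gives -0x20, so ASCII text starts
    with small differences in both directions.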

    > Not yet.
    > How urgently do you need it?
    > Do you still have a local copy of the former UTS 40 text?
    > If not, would it suffice for you (for now) to receive such a copy?
    >
    > Out of curiosity: Would you mind sharing what is the intended use of
    > your BOCU-1 implementation?

    If this technical note is to be published by Unicode, then there is no
    need to ask why an implementation is needed: the note itself already
    answers that by citing several usages. (The document has moved **back**
    from a "UTS draft" to a simpler "technical note", without the
    yellow-background comments that were present two weeks ago and that I
    recently commented on here on this list.)

    If you think that people must justify their use of the algorithm before
    getting comments about how to implement it, then it is not a free standard,
    and what was already limiting the adoption of BOCU-1 (the current copyright
    and the licence agreement required by IBM) is more severe than intended.

    BOCU-1 (UTN #6, i.e. the basic profile) was not removed; only BOCU was. BOCU
    was described in UTS #40 but now belongs to the ICU project again, to which
    IBM has licensed its use; the licensing terms are then those visible in the
    ICU project itself.

    But then, how can the IBM patent restriction be compatible with the ICU
    licence (which is an X-based licence)? See:
    http://dev.icu-project.org/cgi-bin/viewcvs.cgi/icu/license.html?view=co

    Quote:
    "Permission is hereby granted, free of charge, to any person obtaining a
    copy of this software and associated documentation files (the "Software"),
    to deal in the Software without restriction, including without limitation
    the rights to use, copy, modify, merge, publish, distribute, and/or sell
    copies of the Software, and to permit persons to whom the Software is
    furnished to do so, provided that the above copyright notice(s) and this
    permission notice appear in all copies of the Software and that both the
    above copyright notice(s) and this permission notice appear in supporting
    documentation."

    If we look at these terms, the IBM patent restriction cannot apply, even
    though the reference code that describes BOCU is the one implemented in ICU!
    This ICU licence (signed by IBM itself!) explicitly gives anyone the
    right to use or derive any implementation of the complete BOCU algorithm
    (and also of its derived basic profile, BOCU-1), provided that there is an
    attribution: "Copyright (c) 1995-2006 International Business Machines
    Corporation and others" for parts of the code derived from ICU, such as a
    BOCU implementation. How could IBM then later claim royalty fees on any
    software implementing BOCU or BOCU-1, when it explicitly gave all the rights
    to use it "free of charge"? To me, the patent is only there to preserve the
    legitimacy of the IBM copyright, so that no other company can claim fees for
    what IBM provides to the community for free (it is just a proof of
    authorship).

    The reversion of the changes and the disappearance of the past comments are
    very intriguing. This should have been noted in some history! The simple
    notice that UTS 40 was withdrawn is not enough, because UTN 6 was also
    reverted.


