Re: Corrigendum #1 (UTF-8 shortest form) wording: MIME, and software interfaces specifications

From: Doug Ewell
Date: Sat Nov 08 2003 - 17:47:53 EST

    Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

    > OK but you probably did not notice that the Corrigendum #1, published
    > after Unicode 4.0, refers to a backward version of ISO/IEC
    > 10646-1:2000, which is not the version ISO/IEC 10646:2003 refered in
    > Unicode 4.0. This added reference in the Corrigendum (not in the first
    > publication of Unicode 4.0) is then confusive...

    Are we looking at the same Corrigendum?

    The Unicode Character Database for version 4.0.0 is dated April 18,
    2003.

    This is the same day that Mark Davis sent out an e-mail formally
    announcing the "final release" of Unicode 4.0.

    Other files in the Unicode database are dated April 16 or 17 (except
    for Index.txt, Jamo.txt, and Unihan.txt and .zip, which are dated
    March 27).

    This establishes rather conclusively that Unicode 4.0 was officially
    released, or "published," in mid-April 2003. The book, of course,
    didn't become available until some months later.

    Meanwhile, the Corrigendum has a "Last updated" date of March 5,
    2003 -- and that just refers to the last date *any* change was made to
    this HTML file (spelling correction, broken link fixed, stylesheet
    changed). The actual publication date of the Corrigendum could have
    been well before that. But even taking March 5 as the latest possible
    date, this is before the mid-April publication of Unicode 4.0.

    And indeed, if you go back far enough, you will see that the Corrigendum
    was originally written against Unicode 3.0.1, sometime between August
    2000 and March 2001.

    The Corrigendum is not listed on the main "Updates and Errata" page,
    except in the link on the sidebar at the left side of the page. (That
    sidebar also includes links to the Standard Annexes defining Unicode 3.1
    and 3.2, if you want to look at some more history.) The "Updates and
    Errata" page includes a prominent note, at the very top, that all
    updates prior to April 17, 2003 have been incorporated in Unicode 4.0.

    So unless there is a different Corrigendum #1 floating around, the
    whole premise is false and misleading: that the Corrigendum was
    "published after Unicode 4.0," was "not in the first publication of
    Unicode 4.0," and so on, and that it thereby raises interoperability
    problems because ISO/IEC 10646-1:2000 had not yet been amended to
    exclude code points beyond U-0010FFFF. There is no conflict, no
    confusion, and no controversy.

    -Doug Ewell
     Fullerton, California
