RE: UTN #31 and direct compression of code points

From: Philippe Verdy (
Date: Tue May 08 2007 - 14:45:04 CDT

    Doug Ewell wrote:
    > Sent: Tuesday, May 8, 2007 08:26
    > To: Unicode Mailing List
    > Cc: Richard Wordingham
    > Subject: Re: UTN #31 and direct compression of code points
    > Richard Wordingham <richard dot wordingham at ntlworld dot com> wrote:
    > >> On a large alphabet like Unicode, this conversion table will have a
    > >> very significant size,...
    > >
    > > That entirely depends on how one stores the table. One need only
    > > store the entries for the characters that occur in the text.
    > That is exactly the point I've been trying to make about the supposed
    > "large alphabet" effect. This e-mail contains no Cyrillic characters,
    > and a Unicode-based Huffman encoding of it would not need to allocate
    > space for Cyrillic characters, even though there are hundreds of
    > Cyrillic characters in Unicode.
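
The point quoted above (a Huffman table only needs entries for the characters that actually occur) can be illustrated with a minimal sketch. The function name `huffman_code_lengths` is hypothetical, and the code only computes code lengths rather than emitting a bitstream; it is not taken from UTN #31 or from any particular implementation.

```python
import heapq
from collections import Counter


def huffman_code_lengths(text):
    """Return {character: code length in bits} for the characters in text.

    Only characters that occur in the text get an entry, so the table size
    depends on the text, not on the size of the Unicode repertoire.
    """
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {ch: 1 for ch in freq}

    # Heap entries are (weight, unique id, {char: depth so far}).
    # The unique id breaks ties so the dicts are never compared.
    heap = [(w, i, {ch: 0}) for i, (ch, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)

    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every leaf one level deeper.
        merged = {ch: depth + 1 for ch, depth in left.items()}
        merged.update({ch: depth + 1 for ch, depth in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1

    return heap[0][2]


if __name__ == "__main__":
    table = huffman_code_lengths("abracadabra")
    print(len(table), "table entries:", table)
```

For a text that contains no Cyrillic characters, the resulting table simply has no Cyrillic entries; a decoder would need the same table (or the frequencies it was built from) transmitted alongside the compressed data.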

    Side note: do you know whether so-called "arithmetic coding" (which pushes
    compression a bit further using principles similar to Huffman coding, but
    approximates the entropy of the source more closely, and which also needs
    similar tables for its statistical decision tree) is still encumbered by
    the IBM patents on it?

    I ask because some people have shown that they could produce completely
    equivalent results working only from a prior-art document that uses a
    different analogy. The equivalence has now been demonstrated in the
    mathematical sense, even though the two definitions rest on different
    background concepts (i.e. there exists a bijection between the two models
    implied by the two conceptual definitions).

    So many free, open-source implementations of some codecs (for example,
    JPEG image decoders) now cite this prior-art document in their
    documentation instead of the IBM patent, and say only that the codec is
    "compatible" with the JPEG standard rather than claiming to implement it
    in a compliant way, even though this makes no practical difference: these
    applications do comply with the standard if you test them.
