Re: Data compression

From: Doug Ewell
Date: Sat May 07 2005 - 18:10:39 CDT

    Peter Kirk <peterkirk at qaya dot org> wrote:

    >> All text compression schemes must be lossless.
    > I would suppose that a text compression scheme which treated
    > canonically equivalent sequences as identical (and made use of that
    > for slightly improved compression) would be acceptable, although
    > technically (at least at the byte level) not lossless.

    We had an interesting discussion about this on the list while I was
    finishing up UTN #14. (See the section titled "Compression through
    normalization.") It turned out not to be completely obvious whether
    converting the input to a different normalization form constitutes
    "changing" it. There are some reasons why it might be undesirable for a
    compression process to change the exact code points, and for that
    reason, it probably should not be done unless there is a prior agreement
    in place.
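
To make the point concrete, here is a minimal sketch in Python, assuming a hypothetical compressor that normalizes its input to NFC before compressing. The function names are invented for illustration, and zlib merely stands in for whatever text compression scheme is actually in use. The round trip preserves canonical equivalence but not the exact code points.

```python
import unicodedata
import zlib

# Two canonically equivalent spellings of "e with acute accent".
precomposed = "\u00e9"    # U+00E9
decomposed = "e\u0301"    # U+0065 U+0301

def compress_normalized(text: str) -> bytes:
    # Hypothetical scheme: normalize to NFC first, then compress.
    # zlib is only a stand-in for a real text compressor.
    nfc = unicodedata.normalize("NFC", text)
    return zlib.compress(nfc.encode("utf-8"))

def decompress(blob: bytes) -> str:
    return zlib.decompress(blob).decode("utf-8")

round_tripped = decompress(compress_normalized(decomposed))

# Canonically equivalent: both sides agree once put in the same form.
canonically_equal = (
    unicodedata.normalize("NFD", round_tripped)
    == unicodedata.normalize("NFD", decomposed)
)

# Not identical at the code point level: the decomposed input
# comes back in precomposed form.
code_points_equal = round_tripped == decomposed

print(canonically_equal, code_points_equal)
```

A receiver that compares the output against the input code point by code point, or checks a hash or signature computed over the original bytes, would see a mismatch here, which is one reason a prior agreement matters.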

    Doug Ewell
    Fullerton, California
