RE: Compression through normalization

From: Philippe Verdy (
Date: Thu Nov 27 2003 - 17:02:49 EST

    Doug Ewell writes:
    > Peter Kirk <peterkirk at qaya dot org> wrote:
    > > Yes, the compressor can make any canonically equivalent change, not
    > > just composing composition exclusions but reordering combining marks
    > > in different classes. The only flaw I see is that the compressor does
    > > not have to undo these changes on decompression; at least no other
    > > process is allowed to rely on it having done so.
    > I agree with Peter here. I don't think the burden should be on the
    > decompressor to reverse any operation that the compressor performs,
    > except for the compression itself.

    There's possibly a misreading or misunderstanding about what I call
    "undoing" custom normalization. What I mean is that the
    decompressor can be made to produce a standard NFC or NFD form,
    independently of the ordering of combining marks chosen by the
    compressor, and of whether it composed composition exclusions or not.
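
    As a minimal sketch of that idea (not the compressor discussed in this
    thread): the hypothetical compress() below stands in for any compressor
    that rewrites its input into some canonically equivalent form of its own
    choosing, with NFD and zlib used here purely as placeholders. The
    hypothetical decompress() then returns standard NFC or NFD on request,
    whatever the compressor did. Function names and the "form" parameter are
    assumptions made for illustration.

```python
import unicodedata
import zlib
from typing import Optional


def compress(text: str) -> bytes:
    """Hypothetical compressor: it may pick any canonically equivalent form.

    NFD stands in for the compressor's private "custom normalization",
    and zlib stands in for the actual compression scheme.
    """
    custom = unicodedata.normalize("NFD", text)
    return zlib.compress(custom.encode("utf-8"))


def decompress(data: bytes, form: Optional[str] = None) -> str:
    """Hypothetical decompressor.

    With form=None the text comes back in whatever canonically equivalent
    form the compressor chose. With form="NFC" or form="NFD" the output is
    put into that standard form, independently of the compressor's choice.
    """
    text = zlib.decompress(data).decode("utf-8")
    if form is None:
        return text
    return unicodedata.normalize(form, text)


# U+212B ANGSTROM SIGN is a singleton, so this input is in neither NFC nor NFD.
original = "caf\u00e9 \u212b"
packed = compress(original)

as_nfc = decompress(packed, "NFC")
as_nfd = decompress(packed, "NFD")
raw = decompress(packed)
```

    Here raw need not be identical to original, code point for code point,
    but it is canonically equivalent to it, so an application that accepts
    any canonically equivalent string can take raw directly.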

    This way, a decompressor can be made compatible with an application
    that expects a particular normalization form. But if we agree that
    any application should accept any canonically equivalent string, it's
    true that this renormalization step in the decompressor is not needed:
    it's then up to the application using the decompressor to choose its
    own preferred normalization on input, from the output of the

