RE: Compression through normalization

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sat Dec 06 2003 - 17:29:42 EST

    Mark Davis writes:
    > > OK, then I suppose I should play devil's advocate and ask Peter's and
    > > Philippe's question again: If C10 only restricts the modifications to
    > > "canonically equivalent sequences," why should there be an additional
    > > restriction that further limits them to NFC or NFD? Or, put another
    > > way, shouldn't such a restriction be part of C10, if it is important?
    >
    > C10 is a conformance clause; outputting NFC is a best-practice
    > recommendation, not a requirement, and does not belong in C10.

    Simple and effective response, which makes sense. Yes, compressors may
    denormalize strings as they see fit, and decompressors are then not
    required to recompose and reorder their output into NFC form.

    In fact, I think decompressors should do the strict minimum: a very literal
    decompression of their input stream, producing a predictable output from a
    given compressed stream. That predictability is not needed for compressors.
    The added bonus is that the result of the decompression can be checked with
    binary signatures, even if a later process using this result needs to
    normalize it.
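
    To illustrate the point about literal decompression and binary
    signatures, here is a minimal sketch in Python. It rests on assumptions
    that are not part of this thread: zlib is only a stand-in for a real
    Unicode text compressor, the compressor's choice of NFD is arbitrary, and
    the function names are hypothetical.

```python
import hashlib
import unicodedata
import zlib


def compress(text: str) -> bytes:
    """Hypothetical compressor: free to emit any canonically equivalent form.

    Here it arbitrarily picks NFD before handing the bytes to zlib.
    """
    return zlib.compress(unicodedata.normalize("NFD", text).encode("utf-8"))


def decompress(blob: bytes) -> str:
    """Literal decompressor: no recomposition, no reordering."""
    return zlib.decompress(blob).decode("utf-8")


def signature(text: str) -> str:
    """Binary signature of the decompressed result (SHA-256 over UTF-8)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


original = "caf\u00e9"            # precomposed, already in NFC
blob = compress(original)
restored = decompress(blob)

# The same compressed stream always yields the same bytes, so the
# signature of the decompressed result can be verified.
expected = signature(restored)
assert signature(decompress(blob)) == expected

# The code points differ from the input, but the text is canonically
# equivalent; a later process may normalize it if it needs NFC.
assert restored != original
assert unicodedata.normalize("NFC", restored) == original
```

    The signature here covers the decompressor's output, not the original
    input; that is the only thing a literal decompressor can make
    predictable.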

    If that final process needs an NFC form or the original form, the compressor
    may be designed with an optional flag that preserves normalization forms at
    the price of compression. Such an optional flag is not needed if the
    compressed stream already encodes an NFC form directly.
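
    Such a flag could look roughly like the sketch below. This is only an
    illustration under assumptions of my own: the `preserve_form` parameter
    and both function names are hypothetical, zlib again stands in for a real
    text compressor, and recomposing to NFC is just one choice a compressor
    might make when the flag is off.

```python
import unicodedata
import zlib


def compress(text: str, preserve_form: bool = False) -> bytes:
    """Hypothetical compressor with an optional form-preserving flag.

    preserve_form=True  : keep the exact input code points.
    preserve_form=False : the compressor may re-encode; here it recomposes
                          to NFC, which is shorter in UTF-8 for this input.
    """
    if preserve_form:
        payload = text
    else:
        payload = unicodedata.normalize("NFC", text)
    return zlib.compress(payload.encode("utf-8"))


def decompress(blob: bytes) -> str:
    """Literal decompressor: returns exactly what was encoded."""
    return zlib.decompress(blob).decode("utf-8")


decomposed = "re\u0301sume\u0301"   # input in decomposed (NFD) form

exact = decompress(compress(decomposed, preserve_form=True))
recomposed = decompress(compress(decomposed))

assert exact == decomposed                                   # original form kept
assert recomposed == unicodedata.normalize("NFC", decomposed)
assert recomposed != decomposed                              # only canonically equivalent
```

    When the flag is off and the stream already carries NFC, a consumer that
    wants NFC gets it from a literal decompression with no extra step.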

    This archive was generated by hypermail 2.1.5 : Sat Dec 06 2003 - 18:13:41 EST