Re: Compression through normalization

From: Doug Ewell (dewell@adelphia.net)
Date: Sat Dec 06 2003 - 02:38:36 EST

    Kenneth Whistler <kenw at sybase dot com> wrote:

    > I don't think either of our recommendations here are specific
    > to compression issues.

    They're not, but compression is what I'm focusing on right now, and your
    recommendations do *apply* to compression.

    > Basically, if a process tinkers around with changing sequences
    > to their canonical equivalents, then it is advisable that
    > the end result actually *be* in one of the normalization
    > forms, either NFD or NFC, and that this be explicitly documented
    > as what the process does. Otherwise, you are just tinkering
    > and leaving the data in an indeterminate (although still
    > canonically equivalent) state.

    OK, then I suppose I should play devil's advocate and ask Peter's and
    Philippe's question again: If C10 only restricts the modifications to
    "canonically equivalent sequences," why should there be an additional
    restriction that further limits them to NFC or NFD? Or, put another
    way, shouldn't such a restriction be part of C10, if it is important?
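
    To make the "indeterminate (although still canonically equivalent)
    state" concrete, here is a minimal Python sketch using the standard
    unicodedata module. The helper names and the sample strings are
    illustrative assumptions, not anything from the standard or from
    Ken's message. It shows three canonically equivalent spellings of
    the same text, one of which is in neither NFC nor NFD.

```python
import unicodedata

def canonically_equivalent(a: str, b: str) -> bool:
    # Two strings are canonically equivalent when their NFD forms match.
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

def normalization_forms(s: str) -> list[str]:
    # Hypothetical helper: report which of NFC/NFD the string is already in.
    return [f for f in ("NFC", "NFD") if unicodedata.normalize(f, s) == s]

# Three spellings of "a" with diaeresis above and dot below.
nfd_text = "a\u0323\u0308"      # base letter, dot below, diaeresis
nfc_text = "\u1EA1\u0308"       # precomposed a-with-dot-below, then diaeresis
tinkered = "\u00E4\u0323"       # precomposed a-with-diaeresis, then dot below

for s in (nfd_text, nfc_text, tinkered):
    print([f"U+{ord(c):04X}" for c in s],
          canonically_equivalent(s, nfd_text),
          normalization_forms(s))
```

    The third string is what a process permitted only by the "canonically
    equivalent sequences" wording could legitimately produce, yet it is
    in no documented normalization form.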

    -Doug Ewell
     Fullerton, California
     http://users.adelphia.net/~dewell/
