Re: Back to the subject: Folding algorithm and canonical equivalence

From: Asmus Freytag
Date: Mon Jul 19 2004 - 17:23:49 CDT


    At 01:56 PM 7/19/2004, Mark Davis wrote:
    >You did point out an oversight; Asmus and I have been working on the issue.

    As Mark wrote, your point is taken and we've taken it on board. However,
    we won't try to *edit* the text on the list, which is why we are not
    engaging in a long discussion of the details (we've discovered many
    interesting ones; wait for the next version of the text).
    In my replies I tend to focus on issues for which I need more information.


    PS: Just one final comment:

    >>Ideally, an implementation would always interpret two
    >>canonical-equivalent character sequences identically. There are
    >>practical circumstances under which implementations may reasonably
    >>distinguish them.
    >Are the authors of UTR #30 claiming that folding is one of those practical
    >circumstances, or is this just an oversight?

    As it turns out, and not surprisingly, realizing that ideal for any
    arbitrary type of possible folding rule can get complicated (again, I won't
    go into details right now). There may be situations where an optimization
    would break canonical equivalence in the face of permissible, but unusual,
    if not to say 'nonsensical', input. That's what's meant by 'practical
    circumstances'.

    If the ability to 'correctly' handle combining sequences that are a random
    mixture of Khmer and Arabic combining marks were to result in severe
    runtime penalties, would you rather have a 'correct' or a fast implementation?
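
    To make the question concrete, here is a minimal sketch in Python of
    what handling such a mixture involves. It uses only the standard
    `unicodedata` module; the choice of the two marks and the helper name
    `canonically_equivalent` are illustrative assumptions, not anything
    taken from UTR #30.

```python
import unicodedata

# A base letter followed by one Arabic and one Khmer combining mark.
ARABIC_FATHA = "\u064e"   # canonical combining class 30
KHMER_COENG = "\u17d2"    # canonical combining class 9

seq_a = "x" + ARABIC_FATHA + KHMER_COENG
seq_b = "x" + KHMER_COENG + ARABIC_FATHA

def canonically_equivalent(s, t):
    """Hypothetical helper: compare the NFD forms of two strings."""
    return unicodedata.normalize("NFD", s) == unicodedata.normalize("NFD", t)

print(seq_a == seq_b)                        # code point comparison
print(canonically_equivalent(seq_a, seq_b))  # comparison after canonical reordering
```

    The two sequences differ code point by code point, but because the marks
    have different non-zero combining classes, canonical reordering puts them
    in the same order. An implementation that skips this step for speed will
    treat the two inputs differently.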

    Nobody disputes that sequences expected to occur in realistic data,
    including specialized texts, should be handled as expected, even where
    practicalities require some optimizations.
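
    As a rough illustration of how a folding can break canonical
    equivalence, consider the sketch below. The table `NAIVE_FOLD` and both
    functions are hypothetical, and the diacritic-removal folding shown is
    only an example, not the algorithm or data from UTR #30.

```python
import unicodedata

# Hypothetical folding table that lists precomposed letters only.
NAIVE_FOLD = {"\u00e9": "e", "\u00e8": "e", "\u00f1": "n"}

def naive_fold(text):
    """Apply the table code point by code point, without normalizing."""
    return "".join(NAIVE_FOLD.get(ch, ch) for ch in text)

def normalizing_fold(text):
    """Decompose to NFD, drop combining marks, then recompose to NFC."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed
                       if unicodedata.combining(ch) == 0)
    return unicodedata.normalize("NFC", stripped)

composed = "caf\u00e9"      # precomposed e with acute
decomposed = "cafe\u0301"   # e followed by COMBINING ACUTE ACCENT

# The table-only folding leaves the combining accent in the second string.
print(naive_fold(composed) == naive_fold(decomposed))
# Normalizing first gives the same result for both inputs.
print(normalizing_fold(composed) == normalizing_fold(decomposed))
```

    The table lookup is the cheaper operation, which is the kind of
    optimization at issue: it is fine for input in one normalization form
    and wrong for canonically equivalent input in another.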

    So, we are all agreed.
