RE: FYI: Regex paper for UTC

From: Philippe Verdy (
Date: Sun Oct 14 2007 - 13:31:23 CDT


    Hans Aberg wrote:
    > I think that the language set operations amiss might be added
    > (intersection and complement) might be added, which can be reduced to
    > ordinary REs. If there are other operators to be defined, they need
    > to be described clearly in a theoretical manner, so one is not left
    > guessing from a few examples.

    I have given a theoretical basis for that; you did not understand it. That's
    why I gave some examples, which is a very usual way to show what we mean
    (many complex theoretical descriptions in TUS are followed by examples; look
    at the UCA specs: nobody would clearly understand what is meant without some
    examples). But I have not built my implementation only to support these few
    examples.

    I have used these examples to demonstrate that the simple assertion about
    what a complement is does not suffice: the definition still allows several
    interpretations. In other words, it is still ambiguous, and one of the most
    important things the proposal does not address is the prioritization of
    matches by some ordering.

    If an implementation must return fewer matches, or just one (the "first"),
    it's important that it selects the same one as any other implementation.
    Otherwise these regexps, although written identically, will match
    differently. That is not a problem if they are implemented for use in
    distinct and identifiable applications, but it is clearly a problem if
    those regexps are expected to be interoperable across implementations: for
    example, if they are part of the data in the CLDR to specify a locale, or
    if they are part of an automated script that assumes that the regexp
    engines implicitly used to run them are interoperable. (That's why we find
    so many regexp variants for ed, sed, vim, Perl, ..., which also depend on
    their version, without any way to clearly negotiate the expected behavior.)
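
    To make the point about "first" matches concrete, here is a minimal sketch
    in Python; it is only an illustration and is not taken from the proposal or
    from my implementation. It assumes two common match-selection rules:
    "leftmost-first" (what Python's re module does for alternation) and
    "leftmost-longest" (the POSIX rule), the latter simulated naively here. The
    function names are hypothetical.

```python
import re


def first_match_leftmost_first(pattern, text):
    """Return the first match using Python's own rule: the earliest start,
    and at that start the first alternative that succeeds."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


def first_match_leftmost_longest(pattern, text):
    """Naive simulation of the POSIX-style rule: the earliest start, and at
    that start the longest substring that the pattern matches in full.
    Assumption: the pattern has no lookahead, since endpos truncates the
    text seen by the engine."""
    compiled = re.compile(pattern)
    for start in range(len(text) + 1):
        # Try the longest candidate first, then shorter ones.
        for end in range(len(text), start - 1, -1):
            if compiled.fullmatch(text, start, end):
                return text[start:end]
    return None


pattern = "a|ab"
text = "xab"
print(first_match_leftmost_first(pattern, text))
print(first_match_leftmost_longest(pattern, text))
```

    The same regexp applied to the same text gives "a" under the first rule and
    "ab" under the second, which is exactly the kind of divergence that an
    interoperable specification would have to rule out by fixing an ordering
    of matches.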

    This archive was generated by hypermail 2.1.5 : Sun Oct 14 2007 - 13:33:29 CDT