Re: FYI: Regex paper for UTC

From: Hans Aberg (haberg@math.su.se)
Date: Sat Oct 13 2007 - 14:56:35 CDT


    On 13 Oct 2007, at 21:36, Philippe Verdy wrote:

    >> That operation is not very useful, because the language complement of
    >> say a single character c is the set of all other strings. So if one
    >> is finding the longest string in the language from a point on, and
    >> the string isn't c, all will be eaten.
    >
    > No, such operation is typically used in association with a "&&"
    > operator that restricts the set of matchable strings. They are used
    > also for matching left and right contexts without including these
    > contexts in the returned match.
    >
    > But as you said, "all will be eaten" ONLY IF "the string is not c",
    > so the effect of negation is NOT producing the whole set of possible
    > texts.

    The problem is that any longer string starting with c will also be in
    the language complement and matched. This is not what you want: my
    guess is that you only want the strings of length 1 in this case. And
    similarly, for a string s of length k, you want the complement to be
    all strings of length k not equal to s. Right? This is not the
    language complement, but the (graded) complement in the subset of all
    strings of length k.

    So the language complement is not what you want.
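
To make the distinction concrete, here is a rough sketch in Python. It is
only an illustration under stated assumptions: the three-letter alphabet,
the length cutoff max_len (the true language complement is infinite, so it
is truncated here), and all function names are hypothetical and not taken
from any regex proposal.

```python
from itertools import product

ALPHABET = "abc"  # assumption: a toy alphabet, for illustration only


def strings_of_length(k, alphabet=ALPHABET):
    """All strings of exactly length k over the alphabet."""
    return {"".join(p) for p in product(alphabet, repeat=k)}


def language_complement(language, max_len, alphabet=ALPHABET):
    """Strings of length <= max_len that are NOT in `language`.

    The real complement is infinite; max_len is an artificial cutoff.
    """
    universe = set()
    for k in range(max_len + 1):
        universe |= strings_of_length(k, alphabet)
    return universe - set(language)


def graded_complement(s, alphabet=ALPHABET):
    """Strings of the same length as s that differ from s."""
    return strings_of_length(len(s), alphabet) - {s}


def longest_match(text, language):
    """Longest prefix of `text` that lies in `language`, or None."""
    for end in range(len(text), -1, -1):
        if text[:end] in language:
            return text[:end]
    return None


full = language_complement({"c"}, max_len=3)
graded = graded_complement("c")

print("cc" in full)                  # longer strings starting with c are included
print(longest_match("cab", full))    # the whole input is eaten
print(sorted(graded))                # only the other length-1 strings
print(longest_match("abc", graded))  # exactly one character is matched
```

With the language complement, a longest-match search consumes the entire
input even when it starts with c; with the graded complement it matches a
single character other than c, or nothing at all.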

       Hans Åberg


