RE: FYI: Regex paper for UTC

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sat Oct 13 2007 - 14:36:53 CDT

Next message: Hans Aberg: "Re: FYI: Regex paper for UTC"

Previous message: Rick McGowan: "Re: Use of interum PUA encodings for 85 letters"
In reply to: Hans Aberg: "Re: FYI: Regex paper for UTC"
Next in thread: Hans Aberg: "Re: FYI: Regex paper for UTC"
Reply: Hans Aberg: "Re: FYI: Regex paper for UTC"
Reply: Hans Aberg: "Re: FYI: Regex paper for UTC"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]
Mail actions: [ respond to this message ] [ mail a new topic ]

Hans Aberg [mailto:haberg@math.su.se]wrote:
> That operation is not very useful, because the language complement of
> say a single character c is the set of all other strings. So if one
> is finding the longest string in the language from a point on, and
> the string isn't c, all will be eaten.

No, such operation is typically used in association with a "&&" operator
that restricts the set of matchable strings. They are used also for matching
left and right contexts without including these contexts in the returned
match.

But as you said, "all will be eaten" ONLY IF "the string is not c", so the
effect of negation is NOT producing the whole set of possible texts.

Even the negation of the empty string is not the whole set of possible
texts, but ONLY the set of NON-EMPTY strings. It has a very well-defined
meaning in set algebras.

The fact that the result set of an negated RE is infinite is not a problem
conceptually: in fact, even the RE "r*" is also matching an infinite set,
and despite this, it is productive, because we can build an engine to match
it using a finite number of tests (this number of tests and branches to
implement is also bounded, and only depends on the length of the regular
expression).

Really I should try to convince you by making a reusable and simplified
version of my regexp matcher that supports negations of any regexps
(including those with capturing groups and left or right contextual matches)

Next message: Hans Aberg: "Re: FYI: Regex paper for UTC"
Previous message: Rick McGowan: "Re: Use of interum PUA encodings for 85 letters"
In reply to: Hans Aberg: "Re: FYI: Regex paper for UTC"
Next in thread: Hans Aberg: "Re: FYI: Regex paper for UTC"
Reply: Hans Aberg: "Re: FYI: Regex paper for UTC"
Reply: Hans Aberg: "Re: FYI: Regex paper for UTC"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]
Mail actions: [ respond to this message ] [ mail a new topic ]

This archive was generated by hypermail 2.1.5 : Sat Oct 13 2007 - 14:38:13 CDT