# Re: Regular Expressions and Canonical Equivalence

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Fri, 15 May 2015 23:54:22 +0100

On Sat, 16 May 2015 00:31:53 +0200
Philippe Verdy <verdy_p_at_wanadoo.fr> wrote:

> 2015-05-15 23:57 GMT+02:00 Richard Wordingham <
> richard.wordingham_at_ntlworld.com>:
>
> > On Fri, 15 May 2015 22:09:13 +0200
> > Philippe Verdy <verdy_p_at_wanadoo.fr> wrote:
> >
> > > 2015-05-15 9:10 GMT+02:00 Richard Wordingham <
> > > richard.wordingham_at_ntlworld.com>:
> >
> > > This is because you don't understand the issue !
> >
> > > > Now, a program to check whether a trace matching
> > > > {\u0323|\u0302)* matches (\u0323\u0302)* is very simple. It
> > > > just counts the number of times \u0323 occurs and the number of
> > > > times \u0302 occurs, and returns whether they are equal.
> >
> > > This is wrong. \0323\0323\0302\0302 and \0323\0302\0323\0302 would
> > > pass your counting test (which does not work in a FSA) but they
> > > are NOT canonically equivalent because the identical combining
> > > characters are blocking each other (so arbitrary ordering is not
> > > possible).
> >
> > TUS7.0: D108 Reorderable pair:
> > Two adjacent characters A and B in a coded character sequence
> > <A, B> are a Reorderable Pair if and only if ccc(A) > ccc(B) > 0.
> >
> > Now, ccc(U+0302) = 230 > 220 = ccc(U+0323) > 0, so (U+0302, U+0303)
> > is a reorderable pair.
> >
>
> I do NOT contest that U+0323 and U+0302 can reorder, but the fact that
> U+0323 blocks another occurence of U+0323 because it has the **same**
> combining class.

How does that stop <U+0323, U+0323, U+0302, U+0302> and <U+0323,
U+0302, U+0323, U+0302> being canonically equivalent?

TUS7.0: D109 'Canonical Ordering Algorithm' says:
"In a decomposed character sequence D, exchange the positions of the
characters in each Reorderable Pair until the sequence contains no more
Reorderable Pairs."

There is no mention of blocking in D109.

Richard.
Received on Fri May 15 2015 - 17:55:32 CDT

This archive was generated by hypermail 2.2.0 : Fri May 15 2015 - 17:55:32 CDT