Re: Regexes, Canonical Equivalence and Backtracking of Input

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Mon, 18 May 2015 21:46:44 +0100

On Mon, 18 May 2015 22:40:21 +0300
Eli Zaretskii <eliz_at_gnu.org> wrote:

> > Date: Mon, 18 May 2015 19:35:45 +0100
> > From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
> >
> > Mark Davis has published an algorithm to generate all strings
> > canonically equivalent to a Unicode string
>
> Where can I find the description of that algorithm?

Section 5 of http://unicode.org/notes/tn5/ . There's a lot of detail
missing, and it's easy to overlook the Hangul syllables. The complete
code is rather more complicated than it looks from the wording,
especially if you want successive candidates on successive calls. You
also need to include the legal permutations of the non-starters - the
code as given only delivers the FCD canonical equivalents.
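
To illustrate what "legal permutations of the non-starters" means, here
is a rough Python sketch. It is not the algorithm from the technical
note: it is a brute-force check that decomposes the string to NFD,
permutes each run of non-starters, and keeps only the orderings whose
NFD is unchanged. The function name nonstarter_permutations is made up
for this example, and the sketch does not try to produce the composed
equivalents at all.

```python
import itertools
import unicodedata


def nonstarter_permutations(s):
    """Yield each reordering of the combining marks in NFD(s) that is
    still canonically equivalent to s (hypothetical helper, brute force)."""
    nfd = unicodedata.normalize("NFD", s)

    # Split NFD(s) into runs: every starter stands alone, and each
    # maximal sequence of non-starters (ccc != 0) forms one run.
    runs = []
    for ch in nfd:
        is_mark = unicodedata.combining(ch) != 0
        if is_mark and runs and runs[-1][0]:
            runs[-1][1].append(ch)
        else:
            runs.append((is_mark, [ch]))

    # For each run of marks, list every distinct ordering; starters
    # have exactly one choice.
    choices = []
    for is_mark, chars in runs:
        if is_mark:
            perms = {"".join(p) for p in itertools.permutations(chars)}
            choices.append(sorted(perms))
        else:
            choices.append(["".join(chars)])

    # A candidate is canonically equivalent exactly when its NFD
    # matches the NFD of the input, so use that as the filter.
    for parts in itertools.product(*choices):
        candidate = "".join(parts)
        if unicodedata.normalize("NFD", candidate) == nfd:
            yield candidate
```

With a + U+0301 (ccc 230) + U+0323 (ccc 220) both orders of the marks
survive the filter, whereas with U+0301 and U+0300 (both ccc 230) only
the original order does, because marks of equal combining class may not
be swapped. The cost is factorial in the length of each run, so this is
only suitable for checking small cases by hand.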

On further thought, I also think it's actually unnecessary for this
application.

Richard.
Received on Mon May 18 2015 - 15:48:12 CDT
