From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Mon Oct 22 2007 - 17:31:36 CDT
Hans Aberg wrote:
> Sent: Monday, 22 October 2007 22:24
> To: verdy_p@wanadoo.fr
> Cc: Unicode List
> Subject: Re: FYI: Regex paper for UTC
>
> On 22 Oct 2007, at 22:16, Philippe Verdy wrote:
>
> > Note that L may contain strings such as a base letter followed by a
> > diacritic, which is canonically equivalent to its precomposed form.
> > Would only the precomposed form be allowed in [L]? The definition of
> > "length" is not precise enough. For me, the composed and precomposed
> > letters should behave identically, and so their "length" should be 1
> > in both cases. If so, then [L] will contain BOTH the precomposed
> > letter and the sequence of a letter and a diacritic.
>
> Read all the stuff. There are different constructions.
>
> The main point is that the operations you seek are restrictions of
> the language set operations.
No, I am not making any restriction; your proposal is the one making
restrictions. I can support your style together with the legacy POSIX
rules, both in the same regexp and without ambiguity.
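
To make the canonical-equivalence point from the quoted paragraph concrete,
here is a minimal Python sketch. It only shows that a precomposed letter and
its base-plus-diacritic sequence differ in code point count while being
canonically equivalent. The helper names (nfc_length, canonically_equivalent,
matches_single_dot) are invented for this illustration and come from neither
the regex paper nor any existing library; only unicodedata.normalize and
re.fullmatch are standard Python calls.

```python
import re
import unicodedata

# The same user-perceived letter in two canonically equivalent encodings.
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE (one code point)
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT (two code points)


def nfc_length(s: str) -> int:
    """Hypothetical 'length' measured after canonical composition (NFC)."""
    return len(unicodedata.normalize("NFC", s))


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def matches_single_dot(s: str) -> bool:
    """True if a code-point-based engine matches the whole string with one '.'."""
    return re.fullmatch(".", s) is not None


if __name__ == "__main__":
    # Raw code point counts differ between the two encodings...
    print(len(precomposed), len(decomposed))
    # ...but the NFC-based length agrees.
    print(nfc_length(precomposed), nfc_length(decomposed))
    print(canonically_equivalent(precomposed, decomposed))
    # A code-point-based '.' accepts one encoding and rejects the other.
    print(matches_single_dot(precomposed), matches_single_dot(decomposed))
```

Under this assumed definition, both encodings have "length" 1 once
normalized, whereas an engine that counts raw code points treats them
differently. That difference is the ambiguity in "length" raised in the
quoted paragraph.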