Re: New Public Review Issue: Proposed Update UTS #18

From: Mike (mike-list@pobox.com)
Date: Mon Oct 01 2007 - 07:58:32 CST


    > But note that with my notation /\q{ch}./ would NOT be equivalent to /ch./
    > - the latter regexp will match only 3 characters: /c/ followed by /h/
    > followed by what /./ matches by default (i.e. [\u0000-\u10FFFF] minus the set
    > of line terminators, which depends on the single line or multi-line mode in
    > effect, and that I'll note \R).
    > - the former regexp extends the input universe (matched by ".") by making it
    > [\u0000-\u10FFFF\q{ch}] (so that it now contains /c/ or /h/ or the sequence
    > /ch/).

    I'll say it again. I think it's a bad idea for \q to have the side
    effect of changing the meaning of ".".
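
    To make the objection concrete, here is a rough sketch in Python of the
    two readings of ".". Python's re module has no \q, so the "extended"
    dot is emulated with an alternation; the names PLAIN_DOT, EXTENDED_DOT
    and count_elements are made up for this illustration and are not part
    of UTS #18 or of the quoted proposal.

```python
import re

# Plain dot: matches exactly one code point (line terminators excluded by default).
PLAIN_DOT = r"."

# Hypothetical "extended" dot: the digraph "ch" is tried first and counts as
# a single element, emulating a "." whose universe also contains \q{ch}.
EXTENDED_DOT = r"(?:ch|.)"


def count_elements(dot, text):
    """Count how many consecutive matches of `dot` are found in `text`."""
    return len(re.findall(dot, text))


print(count_elements(PLAIN_DOT, "chico"))      # 5: c, h, i, c, o
print(count_elements(EXTENDED_DOT, "chico"))   # 4: ch, i, c, o

# The same pattern text "." accepts different inputs under the two readings.
print(re.fullmatch(PLAIN_DOT, "ch") is None)         # True
print(re.fullmatch(EXTENDED_DOT, "ch") is not None)  # True
```

    The point of the sketch is only that the same "." accepts different
    strings depending on whether a \q has been seen, which is the side
    effect being objected to.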

    > For example, to match all 3-letter words in Spanish between c and d
    > (inclusive, but "c" and "d" won't match because they are not 3 letters) one
    > would use /(?locale=es:(?range:c:d:...))/

    This seems to be way beyond what I think regular expressions are for.
    Maybe you should create a little text matching language....
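
    For comparison, the quoted range example can be written as ordinary
    code outside the regex engine. This is only a sketch under stated
    assumptions: it uses the traditional Spanish alphabet in which "ch"
    and "ll" are single letters, it ignores accented vowels, and ALPHABET,
    letters, sort_key and three_letter_words_between are hypothetical
    names invented here.

```python
import re

# Assumption: traditional Spanish ordering, with "ch" and "ll" as single letters.
ALPHABET = ["a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k", "l",
            "ll", "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u", "v", "w",
            "x", "y", "z"]
RANK = {letter: index for index, letter in enumerate(ALPHABET)}

# Digraphs are tried before the single-character fallback.
TOKEN = re.compile(r"ch|ll|.")


def letters(word):
    """Split a word into letters, keeping 'ch' and 'll' together."""
    return TOKEN.findall(word.lower())


def sort_key(word):
    """Map a word to a list of letter ranks; unknown letters sort last."""
    return [RANK.get(letter, len(ALPHABET)) for letter in letters(word)]


def three_letter_words_between(words, low, high):
    """Keep words of exactly 3 letters whose key lies in [low, high]."""
    low_key, high_key = sort_key(low), sort_key(high)
    return [w for w in words
            if len(letters(w)) == 3 and low_key <= sort_key(w) <= high_key]


words = ["col", "chal", "che", "dos", "cal", "casa", "bar"]
print(three_letter_words_between(words, "c", "d"))  # ['col', 'chal', 'cal']
```

    Here "chal" counts as three letters (ch, a, l) and sorts between "c"
    and "d", while "dos" sorts after "d" and is excluded.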

    Mike


