Re: New Public Review Issue: Proposed Update UTS #18

From: Andy Heninger (andy.heninger@gmail.com)
Date: Tue Sep 25 2007 - 20:36:48 CDT

    Just to add my perspective on this thread,

    All of the discussions and possibilities for language-aware Unicode regular
    expression processing are fascinating, but they aren't implemented anywhere
    (at least not publicly) yet, let alone mature enough to push up as some
    sort of formal recommendation. POSIX regular expressions don't count:
    they're not Unicode, and the locale-sensitive features are pretty much a
    failure (which we should take as a caution).

    What is very important to get right in UTS-18 are the things that
    implementors of conventional, mainstream regular expressions really should
    be doing for Unicode support. These are the recommendations for the Perls,
    PCREs and Javas of the world, which do pay at least some attention to what
    UTS-18 says.

    The changes to the descriptions of new-line handling, and the fixes for the
    problems that Mike pointed out with the description of multi-line mode,
    _really_ need to be perfect.
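
    As a rough illustration of why the new-line wording matters to a
    mainstream engine, here is a minimal sketch. It is not from the thread:
    Python's re module is used only as a convenient example, and the sample
    string is made up. It shows multi-line mode treating only U+000A as a line
    boundary, while the same language's str.splitlines() also breaks at
    U+2028 LINE SEPARATOR.

```python
import re

# Hypothetical sample text: three logical lines, separated by
# U+2028 LINE SEPARATOR and then by U+000A LINE FEED.
text = "alpha\u2028beta\ngamma"

# In Python's re module, MULTILINE makes ^ match at the start of the string
# and immediately after "\n" only, so the line starting at "beta" is missed.
starts_re = re.findall(r"^\w+", text, flags=re.MULTILINE)

# str.splitlines() recognizes a wider set of line terminators,
# including U+2028, so it sees all three lines.
starts_split = text.splitlines()

print(starts_re)     # ['alpha', 'gamma']
print(starts_split)  # ['alpha', 'beta', 'gamma']
```

    The two results disagree about where lines begin, which is the kind of
    gap a precise description of line boundaries would help implementors
    close.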

    The more advanced stuff needs implementation experience to get a better
    handle on what works in practice. And implementors should have a free hand
    to try things, without feeling too constrained by a UTS.

      -- Andy


