RE: New Public Review Issue: Proposed Update UTS #18

From: Michael Maxwell (mmaxwell@casl.umd.edu)
Date: Tue Oct 02 2007 - 11:59:46 CST


    I hesitate to jump into this thread, but:

    Asmus Freytag wrote:
    > Depending on how many accented letters a language uses,
    > writing the equivalent expression manually can be both
    > tedious and error-prone.

    Aren't there two issues here that need to be separated:

    (1) the issue of what some regex *means*, e.g. what ^X means, where X is some regex.

    (2) the question of how easy it is to enter X on a computer.

    It seems to me at least that there are lots of ways of doing (2), including keying stuff in at the command line, using a GUI like Bill Poser's, and/or having pre-compiled regexes. The latter might be user-defined (as with certain FST tools, like xfst or sfst), or they might come pre-defined with a regex-using program (as '[:space:]' does for POSIX regexes), or they might be pre-compiled for different locales or for Unicode blocks. There might even be a future regex program that could pull the meaning of some constant regex off a website, as we do for XML schemas now.
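
    As a rough illustration of the user-defined case, here is a minimal Python sketch. Everything in it is an assumption made for the example: the NAMED_CLASSES registry, the 'fr_accented' name and its contents, and the expand() helper are hypothetical, and Python's re module does not itself understand POSIX-style '[:name:]' classes. The point is only that the name gets expanded before compilation, so the meaning of negation is unaffected by how the set was entered.

```python
import re

# Hypothetical registry of named, pre-defined character sets.
# The names and their contents are illustrative assumptions only.
NAMED_CLASSES = {
    "space": r" \t\n\r\f\v",
    "fr_accented": "àâçéèêëîïôùûüÿ",
}

def expand(pattern: str) -> str:
    """Replace each [:name:] in pattern with the characters registered for name."""
    def substitute(match):
        # Raises KeyError for an unregistered name.
        return NAMED_CLASSES[match.group(1)]
    return re.sub(r"\[:(\w+):\]", substitute, pattern)

# The set is typed once, in the registry; the regexes just refer to it by name.
accented = re.compile(expand("[[:fr_accented:]]"))
not_accented = re.compile(expand("[^[:fr_accented:]]"))
```

    With this, not_accented means exactly "any character not in the named set", however tedious that set would have been to type by hand.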

    I would hate to make the meaning of some regex counter-intuitive just because it's hard to type with today's software.

       Mike Maxwell
       CASL/ U Md


