Re: Unicode lexer

From: Hans Aberg (haberg@math.su.se)
Date: Wed Apr 20 2005 - 18:49:11 CST


    At 16:50 -0700 2005/04/20, Tex Texin wrote:
    >> More advanced Unicode support might involve support for recognizing
    >> common Unicode character classes. For example, one might want to
    >> recognize letters, so that one can easily admit identifiers using
    >> letters.
    >> --
    >> Hans Aberg
    >
    >We would want to make use of the character classes and in general follow
    >UAX 31.
    >
    >Anyone have experience good or bad with the UAX 31 model?

    With the method I indicated, one breaks down the Unicode character
    class into a series of code point intervals. The real problem is how
    to automate this, as doing it by hand is probably tedious. One might,
    in the end, get an enormous regular expression; I do not know whether
    a program like Flex imposes limits on that.
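
One way the interval breakdown might be automated is sketched below in Python. This is only an illustration under stated assumptions, not the method from the earlier message: it takes "letters" to mean code points whose general category starts with "L", it relies on whatever Unicode database ships with the Python interpreter, and the names category_intervals and format_intervals are hypothetical.

```python
import sys
import unicodedata


def category_intervals(prefix="L", limit=sys.maxunicode):
    """Return inclusive (start, end) code point ranges whose general
    category starts with `prefix` (assumption: "L" stands for letters)."""
    intervals = []
    start = None  # first code point of the interval currently being built
    for cp in range(limit + 1):
        inside = unicodedata.category(chr(cp)).startswith(prefix)
        if inside and start is None:
            start = cp                          # a new interval opens here
        elif not inside and start is not None:
            intervals.append((start, cp - 1))   # the interval closed at cp - 1
            start = None
    if start is not None:
        intervals.append((start, limit))        # interval runs to the limit
    return intervals


def format_intervals(intervals):
    """Render intervals one per line, e.g. U+0041..U+005A."""
    return "\n".join(
        f"U+{a:04X}" if a == b else f"U+{a:04X}..U+{b:04X}"
        for a, b in intervals
    )


if __name__ == "__main__":
    # Restricted to ASCII, only A-Z and a-z remain.
    print(format_intervals(category_intervals("L", 0x7F)))
```

Over the full code point range the list becomes much longer, and its exact contents depend on the Unicode version in use. Two caveats: UAX 31 defines identifiers through the ID_Start and ID_Continue properties rather than through the letter categories alone, and a byte-oriented scanner generator would still need each interval translated into patterns over the encoded bytes (for example UTF-8), which this sketch does not attempt.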

    -- 
       Hans Aberg
    

