Re: Unicode lexer

From: Tex Texin (tex@i18nguy.com)
Date: Wed Apr 20 2005 - 17:44:26 CST


    Thanks Frank, generally yes to all.
    Speed and modest resource use would of course be on the list too.

    Frank Yung-Fong Tang wrote:
    >
    > I think one question we need to answer first is how to define a
    >
    > Unicode Enabled Lexer
    >
    > I don't have a good answer, but I think it should at least include the
    > following:
    >
    > 1. Have the ability to scan UTF-8 (and/or UTF-16) input files
    > 2. Have the ability to return tokens in one or more Unicode
    > transformation formats
    > 3. Have the ability to handle some set of Unicode regular expression
    > features
    > 4. Have the ability to support programming-language-specific Unicode
    > 'escape' sequences ( \uHHHH, &#ddddd; &#xxxxx; \HHHHH , etc). The
    > lexer may not support them directly, but it should let the lexer's
    > caller define a way to deal with them.
    > 5. Use some Unicode-based string data type as the primitive data type
    > for returning the result in the token.
    >
    >
    > --
    > Frank Yung-Fong Tang
    > 譚永鋒
    > Šýšţém Årçĥîţéçţ
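
To make points 1, 3, 4 and 5 of the list above a little more concrete, here is a rough Python sketch. It is not taken from any existing lexer: the names Token, lex and unescape_backslash_u are hypothetical, the token classes are invented for illustration, and only the \uHHHH escape form is handled. It decodes UTF-8 or UTF-16 bytes, lets the caller plug in an escape handler, matches tokens with Unicode-aware regular expression classes, and returns each token as a Unicode string.

```python
import re
from typing import Callable, Iterator, NamedTuple, Optional


class Token(NamedTuple):
    kind: str
    text: str  # Python's str holds Unicode code points (point 5)


# Hypothetical escape syntax: exactly four hex digits after \u.
_BACKSLASH_U = re.compile(r"\\u([0-9A-Fa-f]{4})")


def unescape_backslash_u(text: str) -> str:
    """Example caller-supplied handler for \\uHHHH escapes (point 4).

    Assumption: each escape names one BMP code point; surrogate pairs
    written as two escapes are not recombined here.
    """
    return _BACKSLASH_U.sub(lambda m: chr(int(m.group(1), 16)), text)


# For str patterns, \w, \d and \s are Unicode-aware in Python 3 (point 3).
_TOKEN = re.compile(
    r"""
      (?P<IDENT>[^\W\d]\w*)   # a letter or underscore, then word characters
    | (?P<NUMBER>\d+)         # a run of decimal digits
    | (?P<SPACE>\s+)          # whitespace, dropped by lex()
    | (?P<OTHER>.)            # any other single character
    """,
    re.VERBOSE | re.DOTALL,
)


def lex(
    data: bytes,
    encoding: str = "utf-8",
    unescape: Optional[Callable[[str], str]] = None,
) -> Iterator[Token]:
    """Decode the input (point 1) and yield non-whitespace tokens."""
    text = data.decode(encoding)  # raises UnicodeDecodeError on bad input
    if unescape is not None:
        # Escapes are resolved in a pre-pass, before any token is matched.
        text = unescape(text)
    for match in _TOKEN.finditer(text):
        if match.lastgroup != "SPACE":
            yield Token(match.lastgroup, match.group())


if __name__ == "__main__":
    source = "譚永鋒 = caf\\u00e9 42".encode("utf-8")
    for token in lex(source, unescape=unescape_backslash_u):
        print(token)
```

Resolving escapes in a pre-pass keeps the lexer itself ignorant of any particular escape syntax, which is one way to read "let the caller define a way to deal with it"; a lexer that needs escapes to stay visible inside string literals would have to call the handler per token instead.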

    -- 
    -------------------------------------------------------------
    Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
    Xen Master                          http://www.i18nGuy.com
                             
    XenCraft                            http://www.XenCraft.com
    Making e-Business Work Around the World
    -------------------------------------------------------------
    

