From: Hans Aberg (haberg@math.su.se)
Date: Wed Jan 19 2005 - 17:51:30 CST
At 19:35 +0100 2005/01/19, Philippe VERDY wrote:
>> From: "Arcane Jill"
>> As a programmer myself, I actually followed that explanation. But I wonder if
>> it's the right approach. Would it not be a more ... interesting ... approach,
>> to forget Flex, and instead write a brand new Unicode lexer generator which
>> generates a lexer that processes characters (not bytes)?
>
>Why not JFlex, a free GPL-licenced lexer on SourceForge?
>See <http://jflex.de/> for the documentation, download, and access to its
>development.
>
>Yes, it's not a direct replacement, because it is written in Java for Java, but
>it is still a base for generating lexers that will compile with C++. It also has
>full Unicode support. The bad thing is its current limitation to 64K DFA states
There is a "Unicode" version of Flex that uses a 16-bit wchar_t. Its lookup
tables are then arrays of 2^16 entries, so it does not help with implementing
the full Unicode range.
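To make the sizes concrete, here is a small sketch (Python is used only for illustration, and the names are hypothetical, not taken from Flex). It assumes a lookup table indexed directly by the input unit: a 16-bit wchar_t gives 2^16 entries per DFA state, while indexing by any Unicode code point would need 0x110000 entries.

```python
# Illustrative only: hypothetical names, not Flex's actual table layout.
BMP_TABLE_SIZE = 1 << 16        # entries addressable by a 16-bit wchar_t
UNICODE_CODE_POINTS = 0x110000  # code points U+0000 through U+10FFFF


def fits_16bit_table(code_point: int) -> bool:
    """True if the code point can index a 2^16-entry table directly."""
    return 0 <= code_point < BMP_TABLE_SIZE


def direct_table_entries(num_states: int, alphabet_size: int) -> int:
    """Entries in an uncompressed table with one row per DFA state."""
    return num_states * alphabet_size


print(fits_16bit_table(0x20AC))    # EURO SIGN, inside the 16-bit range
print(fits_16bit_table(0x1D11E))   # MUSICAL SYMBOL G CLEF, outside it
print(direct_table_entries(1, UNICODE_CODE_POINTS) // BMP_TABLE_SIZE)
```

The last line shows how many 2^16-entry rows one full-range row would occupy, which is why a direct table over all code points quickly calls for some form of compression.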
>(but this could be patched by changing the internal representation for these
>tables)
That kind of table compression is what one would want to avoid. Therefore I
started to think about the regular expression method.
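This message does not spell the method out, so the following is only a sketch under an assumption: that "the regular expression method" means rewriting a range of Unicode code points as a regular expression over UTF-8 bytes, which a byte-oriented lexer generator with 256-entry tables could then process. The function names are hypothetical, and Python is used only for illustration.

```python
import re


def utf8_byte_ranges(lo, hi):
    """Yield byte-range sequences that together match the UTF-8
    encodings of the code points lo..hi (surrogates excluded)."""
    if lo > hi:
        return
    # Surrogates U+D800..U+DFFF have no UTF-8 encoding: cut them out.
    if lo <= 0xDFFF and hi >= 0xD800:
        yield from utf8_byte_ranges(lo, 0xD7FF)
        yield from utf8_byte_ranges(0xE000, hi)
        return
    # Split where the encoded length changes (1, 2, 3, 4 bytes).
    for limit in (0x7F, 0x7FF, 0xFFFF):
        if lo <= limit < hi:
            yield from utf8_byte_ranges(lo, limit)
            yield from utf8_byte_ranges(limit + 1, hi)
            return
    if hi <= 0x7F:
        yield [(lo, hi)]
        return
    # Split until every continuation byte spans either one shared
    # value or its whole range 0x80..0xBF.
    for i in (1, 2, 3):
        mask = (1 << (6 * i)) - 1
        if (lo & ~mask) != (hi & ~mask):
            if (lo & mask) != 0:
                yield from utf8_byte_ranges(lo, lo | mask)
                yield from utf8_byte_ranges((lo | mask) + 1, hi)
                return
            if (hi & mask) != mask:
                yield from utf8_byte_ranges(lo, (hi & ~mask) - 1)
                yield from utf8_byte_ranges(hi & ~mask, hi)
                return
    # Same length, aligned ends: pair up the bytes position by position.
    yield list(zip(chr(lo).encode("utf-8"), chr(hi).encode("utf-8")))


def utf8_regex(lo, hi):
    """Render the byte-range sequences as one byte-level regex string."""
    alternatives = []
    for seq in utf8_byte_ranges(lo, hi):
        parts = []
        for b_lo, b_hi in seq:
            if b_lo == b_hi:
                parts.append("\\x%02x" % b_lo)
            else:
                parts.append("[\\x%02x-\\x%02x]" % (b_lo, b_hi))
        alternatives.append("".join(parts))
    return "|".join(alternatives)


print(utf8_regex(0x80, 0x7FF))  # prints [\xc2-\xdf][\x80-\xbf]
```

The resulting pattern contains only byte values, so it can be checked with any byte-level matcher, for example `re.compile(utf8_regex(0x370, 0x1F600).encode("ascii"))` applied to UTF-8 encoded input.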
Hans Aberg