RE: 32'nd bit & UTF-8

From: Jon Hanna (jon@hackcraft.net)
Date: Tue Jan 18 2005 - 14:25:39 CST

    > Under C/C++, one will use a wchar_t which is always of exactly 32-bit,
    > regardless what internal word structure the CPU is using in
    > its memory bus.

    wchar_t can be 7 bits in size or more than 128 bits.
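
A minimal sketch of how one might check this on a given platform, assuming a hosted C compiler. The helper name wchar_bits is hypothetical, not a standard function, and the result counts storage bits (including any padding bits):

```c
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* wchar_t, size_t */

/* Storage width of wchar_t, in bits, on the implementation that
   compiles this code. wchar_bits is a hypothetical helper name. */
size_t wchar_bits(void)
{
    return sizeof(wchar_t) * CHAR_BIT;
}
```

The value differs between implementations: it is commonly 16 with Windows compilers and 32 with glibc on Linux, so portable code should not assume exactly 32.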

    > > Not sure if I understand you correctly. What about 00 vs.
    > > C0.80, E0.80.80, FE.80.80.80.80.80.80 etc.?
    >
    > I have added functions that admit creating regular expressions
    > also for the overloaded UTF-BSS ("UTF-8") multibytes. This way,
    > a lexer can provide

    They aren't "overloaded", they are invalid.
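
To illustrate why those sequences are rejected, here is a rough sketch of a strict single-sequence decoder, assuming the RFC 3629 definition of UTF-8 (at most four bytes, shortest form only). The name utf8_decode_one and its signature are hypothetical, chosen only for this example:

```c
#include <stddef.h>  /* size_t */

/* Decode one UTF-8 sequence from s[0..n). Returns the number of bytes
   consumed, or 0 if the sequence is invalid. Hypothetical helper,
   assuming RFC 3629 rules. */
size_t utf8_decode_one(const unsigned char *s, size_t n, unsigned long *cp)
{
    /* Smallest code point that legitimately needs a sequence of each length. */
    static const unsigned long min_cp[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t len, i;
    unsigned long c;

    if (n == 0) return 0;
    if (s[0] < 0x80)                { len = 1; c = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; }
    else return 0;  /* stray continuation byte, or a lead byte F8..FF such as FE */

    if (n < len) return 0;  /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;  /* not a continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }
    if (c < min_cp[len]) return 0;             /* overlong form, e.g. C0 80 */
    if (c > 0x10FFFF) return 0;                /* beyond the Unicode range */
    if (c >= 0xD800 && c <= 0xDFFF) return 0;  /* surrogate code point */
    *cp = c;
    return len;
}
```

With this check, 00 decodes to U+0000 in one byte, while C0 80 and E0 80 80 fail the shortest-form test and FE is not accepted as a lead byte at all.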

    Regards,
    Jon Hanna
    Work: <http://www.selkieweb.com/>
    Play: <http://www.hackcraft.net/>
    Chat: <irc://irc.freenode.net/selkie>


