Re: Factor implements 24-bit string type for Unicode support

From: Markus Scherer
Date: Mon Feb 04 2008 - 11:47:42 CST

    Most Unicode software and libraries use UTF-16 internally, which is easy to use.
    Some use UTF-8 even internally, if they see a large majority of
    high-volume text in ASCII.
    UTF-32 as a string encoding is rare. (Some people describe integers
    that hold a single code point as being "in UTF-32".)
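
A rough illustration of the size trade-off between the three encoding forms, as a minimal sketch: the sample string and the helper name `sizes` are made up for this example, and it says nothing about what any particular library does internally.

```python
# Count how many code units one string needs in each Unicode encoding form.
# The sample text mixes ASCII, a Latin-1 letter, a BMP symbol, and one
# supplementary-plane character (U+1D11E), chosen only for illustration.
text = "Na\u00efve \u20ac \U0001D11E"


def sizes(s):
    """Return code point and code unit counts for s (hypothetical helper)."""
    utf8 = s.encode("utf-8")        # 1 to 4 bytes per code point
    utf16 = s.encode("utf-16-le")   # 1 or 2 16-bit units per code point
    utf32 = s.encode("utf-32-le")   # always one 32-bit unit per code point
    return {
        "code_points": len(s),
        "utf8_bytes": len(utf8),
        "utf16_units": len(utf16) // 2,
        "utf32_units": len(utf32) // 4,
    }


print(sizes(text))
print(sizes("plain ASCII"))
```

For pure ASCII all four counts are equal, which is the case where UTF-8 is the most compact in bytes; only the supplementary-plane character needs two UTF-16 code units (a surrogate pair).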

    Roll your own encoding form, and you can't use any existing libraries... Why?


    On Feb 4, 2008 5:49 AM, Hans Aberg wrote:
    > I think that 32-bit is probably best for internal use in programs for
    > speed, avoiding alignment problems; the best way to actually know is
    > to do some profiling. Externally, for distributed files, UTF-8 seems
    > best, because most agree on how to sort out the bits and the bytes.

    Opinions expressed here may not reflect my company's positions unless
    otherwise noted.
