Re: Factor implements 24-bit string type for Unicode support

From: Hans Aberg (haberg@math.su.se)
Date: Mon Feb 04 2008 - 12:22:12 CST

Next message: Philippe Verdy: "RE: Factor implements 24-bit string type for Unicode support"

Previous message: Markus Scherer: "Re: Factor implements 24-bit string type for Unicode support"
In reply to: Markus Scherer: "Re: Factor implements 24-bit string type for Unicode support"
Next in thread: Philippe Verdy: "RE: Factor implements 24-bit string type for Unicode support"

On 4 Feb 2008, at 18:47, Markus Scherer wrote:

> Most Unicode software and libraries use UTF-16 internally, which is
> easy to use.

It may then be a legacy from the days when one thought two bytes would
be enough. It is common in computing to keep outdated forms just for
backwards compatibility, even long after they have fallen out of use.

> Some use UTF-8 even internally, if they see a large majority of
> high-volume text in ASCII.

Sure, for programs that essentially process bytes. I made a regular
expression processor, so that lexers like Flex need not be rewritten;
they essentially just process byte patterns anyway.
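
To illustrate the idea of handing a byte-oriented lexer plain byte
patterns, here is a minimal sketch in Python. It is not the regular
expression processor mentioned above; the function names
utf8_byte_pattern and utf8_alternation are hypothetical, and the output
is only assumed to be usable as a pattern in a lexer that accepts \xHH
byte escapes.

```python
def utf8_byte_pattern(text: str) -> str:
    """Return the UTF-8 encoding of `text` as a run of \\xHH byte escapes."""
    return "".join(f"\\x{b:02X}" for b in text.encode("utf-8"))


def utf8_alternation(chars: str) -> str:
    """Rewrite a set of characters as an alternation of UTF-8 byte sequences.

    A byte-oriented lexer cannot treat a multi-byte character as one
    member of a character class, so each character becomes its own
    byte-sequence alternative.
    """
    return "(" + "|".join(utf8_byte_pattern(c) for c in chars) + ")"


if __name__ == "__main__":
    # Prints one alternative per character, each spelled out as bytes.
    print(utf8_alternation("\u00C5\u20AC"))
```

A real converter would also have to split code point ranges into byte
ranges, which this sketch does not attempt.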

> UTF-32 as a string encoding is rare. (Some people call single-code
> point integers "in UTF-32".)

This would be for libraries that cannot handle variable-size
characters. C++ maybe(?).
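
As a small illustration of what "variable size" means here, the
following Python sketch counts how many code units one character takes
in each encoding form. The helper name code_units is hypothetical; it
relies only on Python's built-in codecs.

```python
def code_units(ch: str) -> dict:
    """Count the code units needed for one character in each encoding form."""
    return {
        "UTF-8": len(ch.encode("utf-8")),            # 8-bit code units
        "UTF-16": len(ch.encode("utf-16-le")) // 2,  # 16-bit code units
        "UTF-32": len(ch.encode("utf-32-le")) // 4,  # 32-bit code units
    }


if __name__ == "__main__":
    # U+0041, U+00C5, and U+1D11E (a character outside the 16-bit range)
    for ch in ("A", "\u00C5", "\U0001D11E"):
        print(f"U+{ord(ch):04X}", code_units(ch))
```

Only UTF-32 uses one code unit for every character; UTF-16 needs two
code units for characters above U+FFFF, so it is a variable-size form
as well.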

Hans Åberg

This archive was generated by hypermail 2.1.5 : Mon Feb 04 2008 - 12:26:00 CST