From: Hans Aberg (firstname.lastname@example.org)
Date: Mon Feb 04 2008 - 12:22:12 CST
On 4 Feb 2008, at 18:47, Markus Scherer wrote:
> Most Unicode software and libraries use UTF-16 internally, which is
> easy to use.
That may then be a legacy from the days when one thought two bytes
would be enough. It is common in computing to keep outdated forms just
for backwards compatibility, even long after they have fallen out of use.
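
To make the "two bytes would be enough" point concrete: a code point
above U+FFFF does not fit in one 16-bit unit, so UTF-16 has to spend two
units (a surrogate pair) on it. A minimal Python sketch of that
arithmetic follows; the helper name utf16_units is made up for
illustration and is not taken from any library.

```python
def utf16_units(cp):
    """Return the UTF-16 code units for one code point (hypothetical helper)."""
    if not (0 <= cp <= 0x10FFFF) or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x10000:
        # Fits in a single 16-bit unit, as originally envisioned.
        return [cp]
    # Beyond U+FFFF: split the 20-bit offset into a high and a low surrogate.
    v = cp - 0x10000
    return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]


if __name__ == "__main__":
    for cp in (0x41, 0x20AC, 0x1D11E):
        print("U+%04X ->" % cp, " ".join("%04X" % u for u in utf16_units(cp)))
```

So UTF-16 is itself a variable-width encoding; it is fixed-width only
for text that stays below U+10000.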
> Some use UTF-8 even internally, if they see a large majority of
> high-volume text in ASCII.
Sure, for programs that essentially process bytes. I made a regular
expression process so that lexers like Flex need not be rewritten;
they essentially just process byte patterns anyway.
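
As a rough illustration of what "byte patterns" can mean here (a sketch
only, not necessarily how the regular expression processing mentioned
above works): a character can be spelled out as its UTF-8 bytes, and a
byte-oriented lexer then matches that byte sequence. The helper name
utf8_byte_pattern is hypothetical.

```python
def utf8_byte_pattern(ch):
    """Render one character as an escaped UTF-8 byte sequence (hypothetical helper)."""
    # Each byte becomes a \xNN escape, the form a byte-oriented lexer can match.
    return "".join("\\x%02X" % b for b in ch.encode("utf-8"))


if __name__ == "__main__":
    for ch in ("A", "\u00e9", "\u20ac"):
        print("U+%04X -> %s" % (ord(ch), utf8_byte_pattern(ch)))
```

Ranges work in the same spirit: for example, the code points U+0080
through U+07FF are exactly the two-byte sequences matching
[\xC2-\xDF][\x80-\xBF].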
> UTF-32 as a string encoding is rare. (Some people call single-code
> point integers "in UTF-32".)
This would be for libraries that cannot handle variable-size
characters. C++, maybe(?).
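
A small Python sketch of why a fixed-size encoding is convenient for
such libraries: in UTF-32 every code point is exactly one 32-bit unit,
so the n-th unit is the n-th code point, while the same text in UTF-8
or UTF-16 has units of varying count per character. The helper name
utf32_units is made up for illustration.

```python
import struct


def utf32_units(text):
    """Return the 32-bit code units of text in UTF-32 (hypothetical helper)."""
    data = text.encode("utf-32-le")  # little-endian, no byte order mark
    return list(struct.unpack("<%dI" % (len(data) // 4), data))


if __name__ == "__main__":
    text = "a\U0001D11E"  # one ASCII letter plus one code point above U+FFFF
    print("UTF-32 units:", ["%X" % u for u in utf32_units(text)])
    print("UTF-8 bytes: ", len(text.encode("utf-8")))
    print("UTF-16 bytes:", len(text.encode("utf-16-le")))
```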