UTF-8 vs UTF-16 as processing code

From: Erik van der Poel (erik@netscape.com)
Date: Fri Jun 16 2000 - 13:35:00 EDT

Hi everybody,

I'm wondering if there are any analyses comparing UTF-8 with UTF-16 for
use as a processing code. UCS-2 has often been considered a good
representation to use internally inside a program because of its "fixed
width" properties (assuming that you can somehow deal with combining
marks, etc.), but UTF-16 clearly isn't fixed width, especially now that
Unicode and ISO/IEC 10646 are about to actually assign characters
beyond U+FFFF.
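
To make the "fixed width" point concrete, here is a minimal sketch in
Python (not part of any existing analysis) that counts how many code
units each form needs for a few sample characters. The helper name
code_units and the sample set are hypothetical, chosen only for
illustration.

```python
# Illustrative sketch: code_units and the samples below are assumptions
# made for this example, not taken from any referenced analysis.

def code_units(ch):
    """Return (UTF-8 bytes, UTF-16 16-bit code units) for one character."""
    utf8_len = len(ch.encode("utf-8"))
    # "utf-16-be" emits no byte order mark; each code unit is 2 bytes.
    utf16_len = len(ch.encode("utf-16-be")) // 2
    return utf8_len, utf16_len

samples = {
    "U+0041": "\u0041",       # LATIN CAPITAL LETTER A
    "U+00E9": "\u00e9",       # LATIN SMALL LETTER E WITH ACUTE
    "U+20AC": "\u20ac",       # EURO SIGN
    "U+10000": "\U00010000",  # first code point beyond U+FFFF
}

widths = {name: code_units(ch) for name, ch in samples.items()}

for name, (u8, u16) in widths.items():
    print(f"{name}: {u8} UTF-8 byte(s), {u16} UTF-16 code unit(s)")
```

Everything up to U+FFFF takes one 16-bit unit in UTF-16, while U+10000
takes two (a surrogate pair), so code that indexes by code unit can no
longer assume one unit per character in either form.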

The kind of analysis I have in mind is one that lists various pros and
cons for each representation. I had a quick look at the Unicode 3.0
book, but I haven't read all of it yet. Does anybody have any pointers
to such analyses, e.g. URLs, books, etc.?


