Re: UTF-8 can be used for more than it is given credit ( Re: UTF-7 - is it dead? )

From: Stephane Bortzmeyer
Date: Fri Jun 02 2006 - 09:03:56 CDT

    On Fri, Jun 02, 2006 at 01:04:50PM +0100,
     Theodore H. Smith wrote
     a message of 66 lines which said:

    > I don't think the argument that we can waste 4x RAM and disk size is
    > a good one. Why buy 4 hard disks when one will do?

    Show me someone who can fill a modern hard disk with only raw text
    (Unicode is just that, raw text) encoded in UTF-32. Even UTF-256 would
    not do it.

    > That RAM has to come from somewhere you know.

    RAM is not occupied by the raw text you manipulate. It is occupied by
    everything else but the text you edit. I just opened a 5k-character
    file in OpenOffice and it uses 130 Mbytes of memory: 26000 bytes per
    character! Next to that, the difference in space between UTF-8 and
    UTF-32 is ridiculously small.
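
    To make the arithmetic concrete, here is a rough sketch in Python.
    The 5k characters and the 130 Mbytes are the figures quoted above;
    the sample text itself (plain ASCII, the best case for UTF-8) and
    the decimal reading of "Mbytes" are assumptions for illustration.

```python
# Hypothetical sample: 5000 ASCII characters, the best case for UTF-8.
text = "a" * 5000

utf8_size = len(text.encode("utf-8"))       # 1 byte per ASCII character
utf32_size = len(text.encode("utf-32-le"))  # 4 bytes per character, no BOM

# Memory reported for the whole process, read as decimal megabytes (assumption).
process_memory = 130 * 1000 * 1000
bytes_per_char = process_memory // len(text)

# Extra bytes UTF-32 costs over UTF-8, as a share of total process memory.
extra_for_utf32 = utf32_size - utf8_size
share = extra_for_utf32 / process_memory

print(utf8_size, utf32_size, bytes_per_char)
print(f"UTF-32 overhead as a share of process memory: {share:.4%}")
```

    With these numbers, switching the text from UTF-8 to UTF-32 costs
    15000 extra bytes, which is lost in the noise of a 130 Mbytes
    process.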