Re: UTF-8 can be used for more than it is given credit ( Re: UTF-7 - is it dead? )

From: Mike Ayers (mayers@celequest.com)
Date: Fri Jun 02 2006 - 17:38:24 CDT

    Theodore H. Smith wrote:

    > Moore's law doesn't mean we should be more wasteful.

            Especially since it's a marketing myth.

    > If you get a computer 4x as fast, instead of using UTF-32 in place of
    > UTF-8, you could maybe make 4x the money by having 4x the throughput.

            That was my point. However, it's not 4x, since UTF-8 has some overhead
    involved in encode/decode.
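
    A rough sketch of the size side of that arithmetic, assuming Python 3.
    The sample strings and the helper name size_ratio are illustrative
    assumptions, not anything from this thread. It suggests the 4x figure
    holds only for pure ASCII text and shrinks as the text moves to
    higher code points; it does not measure the encode/decode cost.

```python
# Sketch: how much larger is UTF-32 than UTF-8 for the same text?
# The sample strings below are made up for illustration.

def size_ratio(text: str) -> float:
    """Return (UTF-32 byte length) / (UTF-8 byte length) for text."""
    utf8_len = len(text.encode("utf-8"))
    # "utf-32-le" is used so that no byte order mark is prepended.
    utf32_len = len(text.encode("utf-32-le"))
    return utf32_len / utf8_len

samples = {
    "ascii": "hello",       # 1 byte per code point in UTF-8
    "greek": "αβγδε",       # 2 bytes per code point in UTF-8
    "cjk": "日本語の文",     # 3 bytes per code point in UTF-8
    "emoji": "😀😀😀😀😀",  # 4 bytes per code point in UTF-8
}

for name, text in samples.items():
    print(f"{name}: UTF-32 is {size_ratio(text):.2f}x the size of UTF-8")
```

    For these samples the ratio runs from 4x (ASCII) down to 1x (text made
    entirely of code points outside the BMP).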

    > I think by the time we are citing Moore's law, this isn't really a
    > Unicode discussion, but a computing in general discussion...

            Are you saying that discussions of Unicode should ignore the
    environment in which they will run? Isn't that part of the reason for
    all these encodings in the first place? Yes, yes it is...

    > My original point was that UTF-8 can be used for more than it is given
    > credit for. You can do lowercasing, uppercasing, normalisation, and
    > just about anything, on UTF-8, without corruption or mistakes, and do
    > it CPU efficiently and far more space efficiently.

            UTF-8 is given plenty of credit, as it is the predominant encoding.
    Nevertheless, you manage to oversell it.

    > And the other point is that a character (aka unicode glyph) is a
    > string. So whatever you do, you'll need to be string processing,
    > treating each character as a variable length unit, so it might as well
    > be a variable 8-bit length unit than 32bit...

            This is just wrong. For starters, UTF-32 is not variable length.
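
    To make the fixed-versus-variable distinction concrete, here is a
    minimal sketch, assuming Python 3 and well-formed input. The helper
    names (utf8_seq_len, nth_code_point_utf32, nth_code_point_utf8) are
    hypothetical. In UTF-32 every code point occupies exactly one 32-bit
    unit, so the n-th code point sits at byte offset 4*n; in UTF-8 the
    earlier sequences have to be walked to find it.

```python
import struct

def utf8_seq_len(lead: int) -> int:
    """Length in bytes of a UTF-8 sequence, judged from its lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError(f"not a UTF-8 lead byte: {lead:#x}")

def nth_code_point_utf32(buf: bytes, n: int) -> int:
    """Fixed width: read the 32-bit little-endian unit at offset 4*n."""
    return struct.unpack_from("<I", buf, 4 * n)[0]

def nth_code_point_utf8(buf: bytes, n: int) -> int:
    """Variable width: skip n sequences, then decode the next one."""
    i = 0
    for _ in range(n):
        i += utf8_seq_len(buf[i])
    length = utf8_seq_len(buf[i])
    return ord(buf[i:i + length].decode("utf-8"))

text = "aé日😀"  # 1-, 2-, 3- and 4-byte UTF-8 sequences, in that order
utf32 = text.encode("utf-32-le")  # little-endian, no byte order mark
utf8 = text.encode("utf-8")

for n in range(len(text)):
    assert nth_code_point_utf32(utf32, n) == nth_code_point_utf8(utf8, n)
    print(n, hex(nth_code_point_utf32(utf32, n)))
```

    Both lookups return the same code points; only the UTF-8 one has to
    scan from the start of the buffer.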

    > Therefore, I win the discussion. Thank you :)

            Damn elven judge!

    /|/|ike


