Re: UTF-8 can be used for more than it is given credit ( Re: UTF-7 - is it dead? )

From: Mike Ayers ([email protected])
Date: Fri Jun 02 2006 - 17:38:24 CDT

Next message: Kenneth Whistler: "Re: UTF-8 can be used for more than it is given credit ( Re: UTF-7 - is it dead? )"

Previous message: Theodore H. Smith: "Re: UTF-8 can be used for more than it is given credit ( Re: UTF-7 - is it dead? )"
In reply to: Theodore H. Smith: "Re: UTF-8 can be used for more than it is given credit ( Re: UTF-7 - is it dead? )"
Next in thread: Philippe Verdy: "Re: UTF-8 can be used for more than it is given credit ( Re: UTF-7 - is it dead? )"
Reply: Philippe Verdy: "Re: UTF-8 can be used for more than it is given credit ( Re: UTF-7 - is it dead? )"

Theodore H. Smith wrote:

> Moore's law doesn't mean we should be more wasteful.

Especially since it's a marketing myth.

> If you get a computer 4x as fast, then instead of using UTF-32 in place
> of UTF-8, you could maybe make 4x the money by having 4x the throughput.

That was my point. However, it's not 4x, since UTF-8 has some overhead
involved in encode/decode.
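
A minimal sketch of where that overhead comes from, assuming well-formed input and an offset that sits on a lead byte. The function names are hypothetical and the UTF-8 routine does no validation, so it is an illustration rather than a conforming decoder. It makes no claim about actual timings; it only shows that UTF-8 needs a branch on the lead byte plus a shift-and-mask per continuation byte, where UTF-32 is a single fixed-width read.

```python
import struct

def decode_utf8_at(buf, i):
    """Decode one code point from well-formed UTF-8 starting at offset i.

    Returns (code_point, next_offset). Hypothetical helper: no checks for
    overlong forms, surrogates, or truncated input.
    """
    b0 = buf[i]
    if b0 < 0x80:            # 0xxxxxxx: single byte (ASCII)
        return b0, i + 1
    if b0 < 0xE0:            # 110xxxxx: one continuation byte follows
        n, cp = 1, b0 & 0x1F
    elif b0 < 0xF0:          # 1110xxxx: two continuation bytes follow
        n, cp = 2, b0 & 0x0F
    else:                    # 11110xxx: three continuation bytes follow
        n, cp = 3, b0 & 0x07
    for k in range(1, n + 1):
        # Each continuation byte (10xxxxxx) contributes six payload bits.
        cp = (cp << 6) | (buf[i + k] & 0x3F)
    return cp, i + n + 1

def decode_utf32be_at(buf, i):
    """Read one code point from big-endian UTF-32: one fixed-width load."""
    return struct.unpack_from(">I", buf, i)[0], i + 4
```

For example, `decode_utf8_at("a\u20ac".encode("utf-8"), 1)` walks three bytes to recover U+20AC, while `decode_utf32be_at` reads the same code point from four bytes without inspecting them first.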

> I think by the time we are citing Moore's law, this isn't really a
> Unicode discussion, but a computing in general discussion...

Are you saying that discussions of Unicode should ignore the
environment in which they will run? Isn't that part of the reason for
all these encodings in the first place? Yes, yes it is...

> My original point was that UTF-8 can be used for more than it is given
> credit for. You can do lowercasing, uppercasing, normalisation, and
> just about anything, on UTF-8, without corruption or mistakes, and do
> it CPU efficiently and far more space efficiently.

UTF-8 is given plenty of credit, as it is the predominant encoding.
Nevertheless, you manage to oversell it.

> And the other point is that a character (aka Unicode glyph) is a
> string. So whatever you do, you'll need to be string processing,
> treating each character as a variable-length unit, so it might as well
> be a variable-length 8-bit unit rather than a 32-bit one...

This is just wrong. For starters, UTF-32 is not variable length.
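
A small sketch of the fixed-width point. The sample code points are picked for illustration and are not from the thread. Each one takes exactly four bytes in UTF-32, while its UTF-8 form takes between one and four.

```python
# Sample code points chosen for illustration (an assumption, not from the thread):
# U+0041, U+00E9, U+20AC, U+1D11E.
samples = ["A", "\u00e9", "\u20ac", "\U0001d11e"]

def encoded_widths(ch):
    """Return (UTF-8 byte count, UTF-32 byte count) for a single code point."""
    # "utf-32-be" is used so that no byte order mark is prepended.
    return len(ch.encode("utf-8")), len(ch.encode("utf-32-be"))

widths = [encoded_widths(ch) for ch in samples]
for ch, (u8, u32) in zip(samples, widths):
    print(f"U+{ord(ch):04X}: UTF-8 = {u8} bytes, UTF-32 = {u32} bytes")
```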

> Therefore, I win the discussion. Thank you :)

Damn elven judge!

/|/|ike

This archive was generated by hypermail 2.1.5 : Fri Jun 02 2006 - 17:49:42 CDT