Re: Getting A Newb Started

From: John H. Jenkins ([email protected])
Date: Mon Jul 07 2008 - 18:18:17 CDT

Next message: William J Poser: "Re: Getting A Newb Started"

Previous message: William J Poser: "Re: Getting A Newb Started"
In reply to: William J Poser: "Re: Getting A Newb Started"
Next in thread: William J Poser: "Re: Getting A Newb Started"
Reply: William J Poser: "Re: Getting A Newb Started"
Reply: Doug Ewell: "Re: Getting A Newb Started"

On Jul 7, 2008, at 3:19 PM, William J Poser wrote:

> There's no way to avoid using more than one byte per character if
> you're using Unicode since there are more than 256 characters. If
> you use UTF-32, every char is four bytes. If you use UTF-8, characters
> take from one to four bytes depending on where the corresponding
> codepoint is. If you use UTF-16, every character in the BMP is two
> bytes, any character outside of the BMP takes four bytes.
>

The fixed width of UTF-32 isn't as much of an advantage as it sounds,
since in most Unicode processes you need to be prepared to deal with
multiple characters at once anyway (a base letter followed by combining
marks, for example).
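
To make the sizes concrete, here is a minimal Python sketch (Python 3.9 or later). The helper name encoded_sizes and the sample characters are illustrative choices of mine, not anything from the thread. It reports how many bytes one code point occupies in each encoding form, then shows that even in UTF-32 a single user-perceived character can span more than one code point.

```python
def encoded_sizes(ch: str) -> dict[str, int]:
    """Return the encoded size in bytes of one code point in each encoding form.

    The "-le" codec variants are used so that no byte order mark is counted.
    """
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }


# One ASCII letter, one Latin-1 letter, one CJK ideograph (all in the BMP),
# and one character outside the BMP (U+1D11E MUSICAL SYMBOL G CLEF).
samples = ["A", "\u00e9", "\u4e2d", "\U0001D11E"]
for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))

# 'e' followed by U+0301 COMBINING ACUTE ACCENT displays as one character
# but is two code points, so it takes eight bytes in UTF-32.
combined = "e\u0301"
print(len(combined), len(combined.encode("utf-32-le")))
```

The last line is the point of the reply: fixed-width code units do not give you fixed-width characters as a reader sees them.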

> The downside of UTF-16 and UTF-8 is that characters are not the same
> length, which makes processing more complicated. With UTF-16, however,
> if you know that there are no characters outside the BMP, every
> character is a constant two bytes wide.
>

That's the problem. You really can't make the assumption that you're
dealing with BMP-only text.
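
A small Python sketch of why the assumption is risky. The function names utf16_code_units and is_bmp_only are hypothetical helpers made up for this illustration. Code that treats every UTF-16 character as one 16-bit unit miscounts as soon as a single supplementary character appears.

```python
def utf16_code_units(text: str) -> int:
    """Count the 16-bit code units needed to store text as UTF-16."""
    # "utf-16-le" writes no byte order mark, so bytes // 2 is the unit count.
    return len(text.encode("utf-16-le")) // 2


def is_bmp_only(text: str) -> bool:
    """Return True if every code point in text lies in the BMP (U+0000..U+FFFF)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


text = "G clef: \U0001D11E"  # ends with U+1D11E, which is outside the BMP

print(len(text))               # number of code points
print(utf16_code_units(text))  # number of UTF-16 code units (one more)
print(is_bmp_only(text))       # False, so "two bytes per character" fails
```

A check like is_bmp_only could guard a fast path, but the general path still has to handle surrogate pairs.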

=====
John H. Jenkins
[email protected]


This archive was generated by hypermail 2.1.5 : Mon Jul 07 2008 - 18:21:45 CDT