Re: UCS-4, UCS-2, UTF-16, UTF-8

From: Mark Davis (markdavis@ispchannel.com)
Date: Thu Feb 17 2000 - 09:40:17 EST


I wrote a paper on this topic, found at http://www.ibm.com/developer/unicode/ (click on Forms of Unicode).

ohmson ohmson wrote:

> Hi Folks,
>
> I have been lurking on the mailing list for a little
> while and have learnt a great deal from it. I have visited
> the unicode.org site and read most of the material there
> (we are waiting for the Unicode 3.0 book to arrive, any day
> now). I also followed Markus Kuhn's FAQ on writing
> Unicode-enabled applications on UNIX (hence the UTF-8 bias).
>
> Our team is ready to write a client/server
> prototype that is going to be I18N. One of our big
> debates is which of the formats listed in the subject
> line we should use to encode the data in the database.
> I started by listing some obvious
> pros and cons and would very much appreciate hearing what
> those of you with the relevant development experience think
> of them. For context, we are using C++
> as the programming language.
>
> UCS-4
> pros:
> - no conversion from UNICODE code points to representation,
> easiest for programming
> cons:
> - major storage waste (4 bytes per character), as only about 1 million
> code points are addressable through UTF-16 and, furthermore, only the
> ~65k in the BMP are of significant interest.
>
> UCS-2
> pros:
> - no conversion from UNICODE code points to representation,
> easiest for programming
> - native to Win NT
> cons:
> - missing out code points beyond the BMP
>
> UTF-16
> pros:
> - all code points are encoded
> - native to Win2000
> - 2 bytes per character for most natural languages
> cons:
> - need conversion algorithm
>
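
The "conversion algorithm" listed as a con for UTF-16 is small. The following is a minimal C++ sketch of it, not taken from the paper mentioned above; the function names `to_utf16` and `from_surrogates` are made up for illustration, and the input is assumed to be a valid code point in the range U+0000..U+10FFFF that is not itself a surrogate.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode one code point as one or two 16-bit UTF-16 code units.
// Assumption: cp is a valid scalar value (not a surrogate, <= U+10FFFF).
std::vector<std::uint16_t> to_utf16(std::uint32_t cp) {
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x10000) {
        // BMP code point: a single unit, identical to its UCS-2 form.
        return { static_cast<std::uint16_t>(cp) };
    }
    std::uint32_t v = cp - 0x10000;  // 20 significant bits remain
    return {
        static_cast<std::uint16_t>(0xD800 + (v >> 10)),   // high surrogate: top 10 bits
        static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF))  // low surrogate: bottom 10 bits
    };
}

// Combine a high/low surrogate pair back into a code point.
std::uint32_t from_surrogates(std::uint16_t hi, std::uint16_t lo) {
    assert(hi >= 0xD800 && hi <= 0xDBFF && lo >= 0xDC00 && lo <= 0xDFFF);
    return 0x10000
         + ((static_cast<std::uint32_t>(hi) - 0xD800) << 10)
         + (static_cast<std::uint32_t>(lo) - 0xDC00);
}
```

For example, `to_utf16(0x4E2D)` yields the single unit 0x4E2D, while `to_utf16(0x1D11E)` yields the pair 0xD834 0xDD1E, and `from_surrogates(0xD834, 0xDD1E)` gives back 0x1D11E. Code that treats every 16-bit unit as a whole character (the UCS-2 view) only goes wrong on such pairs.
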
> UTF-8
> pros:
> - all code points are encoded
> - native to UNIX
> - friendly to sockets programming
> cons:
> - need conversion algorithm
>
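
The UTF-8 conversion is similarly short. Here is a hedged C++ sketch of the encoding direction only; `to_utf8` is a hypothetical name, and the sketch assumes the input is limited to U+0000..U+10FFFF (the range UTF-16 can reach), so it never emits more than 4 bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Encode one code point as 1 to 4 UTF-8 bytes.
// Assumption: cp is a valid scalar value (not a surrogate, <= U+10FFFF).
std::string to_utf8(std::uint32_t cp) {
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::string out;
    if (cp < 0x80) {
        // ASCII is stored unchanged in one byte.
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));           // 110xxxxx
        out += static_cast<char>(0x80 | (cp & 0x3F));         // 10xxxxxx
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));          // 1110xxxx
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));  // 10xxxxxx
        out += static_cast<char>(0x80 | (cp & 0x3F));         // 10xxxxxx
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));          // 11110xxx
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F)); // 10xxxxxx
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));  // 10xxxxxx
        out += static_cast<char>(0x80 | (cp & 0x3F));         // 10xxxxxx
    }
    return out;
}
```

Because no byte of a multi-byte sequence falls in the ASCII range, the result contains no embedded zero bytes (other than for U+0000 itself) and has no byte-order question, which is probably what makes it "friendly to sockets programming" and to `char*`-based UNIX APIs.
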
> I won't go into the storage requirements of UTF-16 vs. UTF-8 because I
> think they depend on the language (CJK requires 2 bytes per character in
> the former but 3 bytes in the latter).
>
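
The storage point can be checked against your own data rather than argued in the abstract. The sketch below counts how many bytes a sequence of code points would occupy in each form; the `EncodedSizes` struct and `encoded_sizes` function are illustrative names, and the input is assumed to be already decoded into valid code points no larger than U+10FFFF.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Byte counts for the same text in three encoding forms.
struct EncodedSizes {
    std::size_t utf8;
    std::size_t utf16;
    std::size_t ucs4;
};

// Assumption: every element of text is a valid code point <= U+10FFFF.
EncodedSizes encoded_sizes(const std::vector<std::uint32_t>& text) {
    EncodedSizes s{0, 0, 0};
    for (std::uint32_t cp : text) {
        assert(cp <= 0x10FFFF);
        // UTF-8: 1 byte for ASCII, 2 below U+0800, 3 for the rest of the BMP, else 4.
        s.utf8 += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        // UTF-16: 2 bytes in the BMP, 4 bytes (a surrogate pair) beyond it.
        s.utf16 += cp < 0x10000 ? 2 : 4;
        // UCS-4: always 4 bytes.
        s.ucs4 += 4;
    }
    return s;
}
```

For the two ASCII characters U+0048 U+0069 this gives 2, 4 and 8 bytes for UTF-8, UTF-16 and UCS-4 respectively; for the two CJK characters U+4E2D U+6587 it gives 6, 4 and 8. Running something like this over a representative sample of the database contents would show which form is smaller for a given language mix.
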
> Thx much, ohmson


