RE: UTF8 vs. Unicode (UTF16) in code

From: Peter_Constable@sil.org
Date: Fri Mar 09 2001 - 14:41:37 EST

Next message: Michael \(michka\) Kaplan: "Re: Unicode market acceptance"
Previous message: Ienup Sung: "Re: UTF8 vs. Unicode (UTF16) in code"
Maybe in reply to: Allan Chau: "UTF8 vs. Unicode (UTF16) in code"
Next in thread: Ayers, Mike: "RE: UTF8 vs. Unicode (UTF16) in code"

On 03/09/2001 12:53:57 PM "Ayers, Mike" wrote:

> Um... no. The UTF-32 CES can handle much more than the current
>space of the Unicode CCS. As far as I can tell, it's good to go until we
>need more than 32 bits to represent the ACR. I'm actually surprised that
>this comment was so misunderstood. Ah, well...

Strictly speaking, I'm afraid you're wrong. The UTF-32 encoding form is
defined in UTR #19, which clearly states:

<quote>
UTF-32 is restricted in values to the range 0..10FFFF₁₆
</quote>

Unsigned 32-bit integers can directly represent 4G (2^32) distinct values;
UTF-32 can accommodate far fewer: only the 1,114,112 values from 0 to
10FFFF hex.
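
To make the gap concrete, here is a minimal sketch in Python. The names UTF32_MAX, UINT32_COUNT and is_valid_utf32_code_unit are hypothetical and chosen only for illustration; the upper bound is the one quoted from UTR #19 above. The check covers only that quoted range and does not model any further restrictions (such as surrogate code points).

```python
# Illustrative sketch only: these names are hypothetical, not from any spec or library.
# The upper bound 0x10FFFF is the limit quoted from UTR #19.
UTF32_MAX = 0x10FFFF

# Number of distinct values an unsigned 32-bit integer can hold.
UINT32_COUNT = 2 ** 32


def is_valid_utf32_code_unit(value: int) -> bool:
    """Return True if value lies in the range 0..10FFFF (hex) permitted for UTF-32."""
    return 0 <= value <= UTF32_MAX


# Number of values UTF-32 may actually use under the quoted restriction.
utf32_count = UTF32_MAX + 1

print(f"32-bit values: {UINT32_COUNT}")
print(f"UTF-32 values: {utf32_count}")
print(is_valid_utf32_code_unit(0x10FFFF))  # last value inside the range
print(is_valid_utf32_code_unit(0x110000))  # first value outside the range
```

A decoder written along these lines would reject any 32-bit unit above 10FFFF hex rather than treat it as a character, even though the integer type itself could hold it.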

- Peter

---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <peter_constable@sil.org>

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:20 EDT