Re: [OT] bits and bytes

From: Peter_Constable@sil.org
Date: Fri May 18 2001 - 10:55:43 EDT


On 05/18/2001 09:39:18 AM "Michael (michka) Kaplan" wrote:

>Well, most of the various CJK encodings clearly would have a lot more than
>9 bits to them. Kind of required for any system dealing with thousands of
>characters.

But do any of them encode using code units larger than 8 bits? Certainly if
something like GB2312 were encoded in a flat (linear?) encoding that never
used code-unit sequences, the code units would have to be larger than 9
bits. But I've only ever heard of them being handled using sequences of
8-bit code units.
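
To make the distinction concrete, here is a rough sketch in Python. It assumes
the standard 94 x 94 layout of GB2312 and uses Python's built-in "gb2312"
codec (the EUC-CN form) as one example of the 8-bit-sequence handling. The
function names and the flat index are purely illustrative and are not part of
any standard.

```python
# GB2312 arranges its characters in a 94 x 94 grid (rows and cells 1..94).
ROWS = CELLS = 94


def flat_bits_needed(positions: int) -> int:
    """Smallest code-unit width (in bits) that can number every position."""
    return (positions - 1).bit_length()


def euc_cn_units(ch: str) -> bytes:
    """The character as a sequence of 8-bit code units (EUC-CN form)."""
    return ch.encode("gb2312")


def flat_index(ch: str) -> int:
    """Hypothetical flat (linear) code for a two-byte GB2312 character.

    Illustration only: the row and cell numbers are recovered from the
    EUC-CN bytes and collapsed into one integer. Single-byte (ASCII)
    characters are not handled.
    """
    hi, lo = euc_cn_units(ch)
    row, cell = hi - 0xA0, lo - 0xA0
    return (row - 1) * CELLS + (cell - 1)


print(flat_bits_needed(ROWS * CELLS))   # width needed for a flat encoding
print(euc_cn_units("中").hex())          # two 8-bit code units
print(flat_index("中"))                  # one wide code unit instead
```

So a flat encoding of the full grid would need 14-bit code units, whereas the
form in common use gets by with pairs of 8-bit ones.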

After sending out that second message, I noticed Nelson Beebe had said,
"However, kcc offered extended datatypes to access 6-bit, 7-bit, 8-bit,
9-bit, and 36-bit characters." Does that qualify as both the largest and
the smallest code units used to represent characters?
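
For a sense of scale, here is a small Python sketch of how many characters of
each of those widths would fit in one machine word. The 36-bit word size is my
assumption (it is not stated in the quote), and the helper name is made up for
illustration.

```python
WORD_BITS = 36  # assumed word size; not stated in the quoted message


def chars_per_word(char_bits: int, word_bits: int = WORD_BITS) -> tuple[int, int]:
    """Return (characters that fit in one word, bits left over)."""
    return word_bits // char_bits, word_bits % char_bits


# Widths taken from the quoted list of kcc character datatypes.
layout = {width: chars_per_word(width) for width in (6, 7, 8, 9, 36)}

for width, (count, spare) in layout.items():
    print(f"{width:2d}-bit chars: {count} per word, {spare} bits unused")
```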

- Peter

---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <peter_constable@sil.org>



This archive was generated by hypermail 2.1.2 : Fri Jul 06 2001 - 00:18:17 EDT