Re: [OT?] QBCS

From: Doug Ewell (dewell@adelphia.net)
Date: Thu Aug 28 2003 - 23:30:03 EDT

Next message: Marco Cimarosti: "RE: [OT?] QBCS"

Previous message: Michael Everson: "Re: Character codes for Egyptian transliteration"
In reply to: Lars Marius Garshol: "Re: [OT?] QBCS"
Next in thread: Marco Cimarosti: "RE: [OT?] QBCS"

Lars Marius Garshol <larsga at garshol dot priv dot no> quoted Marco
Cimarosti:

The original term "DBCS," or "double-byte character set," refers to a
variable-width encoding where each character requires either one or two
bytes. East Asian legacy character encodings fall into this category.
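
To make the "one or two bytes" point concrete, here is a minimal sketch
in Python. Shift_JIS is my own choice of example DBCS, and the helper
name sjis_lengths is made up for illustration; the only thing it relies
on is Python's standard "shift_jis" codec.

```python
def sjis_lengths(text):
    """Return the number of bytes each character occupies in Shift_JIS."""
    # Encode one character at a time so the per-character width is visible.
    return [len(ch.encode("shift_jis")) for ch in text]


# One ASCII letter, one hiragana, one kanji (sample text is an assumption).
lengths = sjis_lengths("Aあ漢")
print(lengths)
```

The ASCII letter comes out as a single byte and the two Japanese
characters as two bytes each, which is the sense in which a DBCS is
variable-width.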

By extension, then, a "QBCS" would be a variable-width character
encoding in which each character takes anywhere from one to four bytes
(in practice one, two, or four) -- an apt description of GB 18030.

Paradoxically (at least to me), the term "multi-byte character set"
refers to a fixed-width encoding, such as UCS-2. The official name of
ISO/IEC 10646 is "Universal Multiple-Octet Coded Character Set."

(BTW, pet peeve: The word "acronym" should only be used to mean a
pronounceable WORD ("nym") formed from the initials of other words.
Classic examples are "scuba" and "radar." If you can figure out how to
pronounce "qbcs," more power to you, but to me it's just an
abbreviation.)

> This must be an oxymoron, in the sense that character sets don't
> really have a byte width, being completely abstract assignments of
> abstract characters to abstract numbers.

This is technically true, but the terms SBCS and DBCS are so entrenched
in the industry that it doesn't seem useful to try to deprecate them
now.

> So what it really means must be "quadra-byte character encoding", and
> both GB 18030 and UTF-32 should fit into that category.

GB 18030, yes, because its characters vary from one to four bytes in
length. UTF-32, no, because its code units are uniformly 32 bits.
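
The difference is easy to see by encoding the same characters both
ways. This is only a sketch: the helper name byte_lengths and the sample
string are my own, and I use Python's "utf-32-be" codec rather than
"utf-32" so that no byte order mark gets counted.

```python
def byte_lengths(text, encoding):
    """Return the encoded size in bytes of each character of text."""
    # Python iterates a str by code point, so a supplementary-plane
    # character is still a single item here.
    return [len(ch.encode(encoding)) for ch in text]


# ASCII letter, a common Han character, and U+10000 (sample is an assumption).
sample = "A中\U00010000"

gb = byte_lengths(sample, "gb18030")       # widths differ per character
utf32 = byte_lengths(sample, "utf-32-be")  # every character is one 32-bit unit

print("GB 18030:", gb)
print("UTF-32:  ", utf32)
```

GB 18030 gives a different width for each of the three characters,
while UTF-32 gives four bytes every time.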

-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/

This archive was generated by hypermail 2.1.5 : Fri Aug 29 2003 - 00:14:35 EDT