RE: Non-ascii string processing?

From: Jill Ramonsky
Date: Mon Oct 06 2003 - 12:31:09 CST

Nor I. "Characters" are perhaps the most useless objects ever invented.

Now - a count of DEFAULT GRAPHEME CLUSTERs might be useful (for example,
for display on a console which uses fixed-width fonts). Indeed, a whole
class of DEFAULT GRAPHEME CLUSTER handling functions might come in very
handy indeed. Bytes are useful. Default grapheme clusters are useful.
But a "character"? What's the point?
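
To make the three kinds of count concrete, here is a minimal Python sketch. It is only an approximation: it treats any code point whose general category starts with "M" (a combining mark) as extending the previous cluster, which is much cruder than the real default grapheme cluster rules in UAX #29. The function name approx_cluster_count is made up for this illustration.

```python
import unicodedata


def approx_cluster_count(text: str) -> int:
    """Rough cluster count: every code point starts a new cluster
    unless it is a combining mark (general category Mn, Mc or Me).
    This is NOT the full UAX #29 segmentation algorithm."""
    count = 0
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        # A mark at the very start has no base, so it counts on its own.
        if not is_mark or count == 0:
            count += 1
    return count


# 'e' + combining acute, then 'a' + combining diaeresis + combining dot below
s = "e\u0301a\u0308\u0323"

print(len(s.encode("utf-8")))   # bytes, which is what strlen() sees in UTF-8
print(len(s))                   # code points, i.e. "characters"
print(approx_cluster_count(s))  # approximate default grapheme clusters
```

For this sample string the three numbers differ (8 bytes, 5 code points, 2 clusters), and only the last one says how many cells a fixed-width console would need.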

But then, a default grapheme cluster might theoretically require up to
16 Unicode characters. (Maybe more, I don't know). Even bit-packed to 21
bits per character, that still gives us 336 bits. So I conclude that our
string processing functions could go a lot faster if only we'd all use
UTF-336. Er....?
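
For anyone who wants to check the arithmetic, a tongue-in-cheek sketch of such a "UTF-336" code unit follows. The limit of 16 code points is the guess made above, not a limit defined by Unicode, and pack_cluster is a hypothetical helper written only for this illustration.

```python
BITS_PER_CODE_POINT = 21  # enough for U+0000 through U+10FFFF
MAX_CODE_POINTS = 16      # the guess from the text, not a Unicode limit
WIDTH = BITS_PER_CODE_POINT * MAX_CODE_POINTS


def pack_cluster(cluster: str) -> int:
    """Pack up to 16 code points into one 336-bit integer, first code
    point in the highest bits. Unused slots are left as zero, so a
    trailing U+0000 cannot be told apart from padding."""
    if len(cluster) > MAX_CODE_POINTS:
        raise ValueError("cluster does not fit in one 336-bit unit")
    value = 0
    for ch in cluster:
        value = (value << BITS_PER_CODE_POINT) | ord(ch)
    # Shift left so that short clusters are padded on the right.
    value <<= BITS_PER_CODE_POINT * (MAX_CODE_POINTS - len(cluster))
    return value


print(WIDTH)       # 336 bits
print(WIDTH // 8)  # 42 bytes per "code unit"
```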


> -----Original Message-----
> From: Marco Cimarosti
> Sent: Monday, October 06, 2003 11:10 AM
> To: 'Doug Ewell'; Unicode Mailing List
> Cc: Theodore H. Smith
> Subject: RE: Non-ascii string processing?
> What strlen() cannot do is counting the number of
> *characters* in a string.
> But who cares? I can imagine very few situations where such
> information would be useful.
> _ Marco
