Re: Nicest UTF

From: Arcane Jill ([email protected])
Date: Mon Dec 06 2004 - 02:39:24 CST

Next message: Peter R. Mueller-Roemer: "OpenType not for Open Communication?"

Previous message: Doug Ewell: "Re: Nicest UTF"
Maybe in reply to: Theodore H. Smith: "Nicest UTF"
Next in thread: Doug Ewell: "Re: Nicest UTF"
Reply: Doug Ewell: "Re: Nicest UTF"
Reply: Philippe Verdy: "Re: Nicest UTF.. UTF-9, UTF-36, UTF-80, UTF-64, ..."

Probably a dumb question, but how come nobody's invented "UTF-24" yet? I
just made that up; it's not an official standard. But one could easily
define UTF-24 as UTF-32 with the most-significant byte (which is always
zero) removed, hence all characters are stored in exactly three bytes and
all are treated equally. You could have UTF-24LE and UTF-24BE variants, and
even UTF-24 BOMs. Of course, I'm not suggesting this is a particularly
brilliant idea, but I just wonder why no-one's suggested it before.
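
To make the idea concrete, here is a minimal Python sketch of what such a
"UTF-24" might look like. It is purely illustrative: the names utf24_encode
and utf24_decode are made up for this example, no such codec exists in any
standard or library, and the choice to reject surrogates and values above
U+10FFFF is an assumption borrowed from how UTF-32 behaves.

```python
def utf24_encode(text: str, byteorder: str = "big") -> bytes:
    """Hypothetical 'UTF-24': store each code point in exactly three bytes.

    byteorder="big" corresponds to the imagined UTF-24BE,
    byteorder="little" to the imagined UTF-24LE.
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        # Assumption: like UTF-32, refuse lone surrogate code points.
        if 0xD800 <= cp <= 0xDFFF:
            raise ValueError(f"surrogate U+{cp:04X} cannot be encoded")
        # Every code point is at most 0x10FFFF, so three bytes always suffice.
        out += cp.to_bytes(3, byteorder)
    return bytes(out)


def utf24_decode(data: bytes, byteorder: str = "big") -> str:
    """Inverse of utf24_encode: read the input three bytes at a time."""
    if len(data) % 3 != 0:
        raise ValueError("length is not a multiple of three bytes")
    chars = []
    for i in range(0, len(data), 3):
        cp = int.from_bytes(data[i:i + 3], byteorder)
        # Three bytes can hold values up to 0xFFFFFF, so range-check.
        if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError(f"invalid code point 0x{cp:06X}")
        chars.append(chr(cp))
    return "".join(chars)
```

For instance, utf24_encode("A") gives the three bytes 00 00 41, and the
little-endian variant gives 41 00 00. Note that three bytes can hold values
well beyond U+10FFFF, which is why the decoder has to range-check.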

(And then of course, there's UTF-21, in which blocks of 21 bits are
concatenated, so that eight Unicode characters will be stored in every 21
bytes - not to mention UTF-20.087462841250343, in which a plain text
document is simply regarded as one very large integer expressed in radix
1114112, and whose UTF-20.087462841250343 representation is simply that
number expressed in binary. But now I'm getting /very/ silly - please don't
take any of this seriously.) :-)
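
For anyone who wants to check the arithmetic behind the two silly ones, a
rough Python sketch follows. The function names are invented for this
example, and the details (big-endian bit order, zero-padding the last byte,
treating the first character as the most significant digit) are assumptions
of mine, since nothing above pins them down.

```python
import math

# Number of code points U+0000..U+10FFFF, i.e. 17 planes of 65536 each.
RADIX = 0x110000  # 1114112


def utf21_encode(text: str) -> bytes:
    """Hypothetical 'UTF-21': concatenate 21-bit blocks, most significant first.

    Assumption: the final byte is padded with zero bits when the total
    bit count is not a multiple of eight.
    """
    acc = 0
    for ch in text:
        acc = (acc << 21) | ord(ch)
    nbits = 21 * len(text)
    pad = -nbits % 8  # zero bits needed to reach a byte boundary
    return (acc << pad).to_bytes((nbits + pad) // 8, "big")


def radix_encode(text: str) -> int:
    """Treat the whole text as one integer written in radix 1114112.

    Caveat: leading U+0000 characters vanish, just as leading zeros do
    in any positional notation.
    """
    n = 0
    for ch in text:
        n = n * RADIX + ord(ch)
    return n


# Eight characters occupy 8 * 21 = 168 bits, which is exactly 21 bytes.
eight_chars = utf21_encode("abcdefgh")

# Average bits per character in the radix scheme: log2(1114112).
bits_per_char = math.log2(RADIX)
```

Here len(eight_chars) comes out as 21, and bits_per_char is 16 + log2(17),
which is where the 20.087... figure comes from.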

The "UTF-24" thing seems a reasonably sensible question though. Is it just
that we don't like it because some processors have alignment restrictions or
something?

Arcane Jill

-----Original Message-----
From: [email protected] [mailto:[email protected]] On
Behalf Of Marcin 'Qrczak' Kowalczyk
Sent: 02 December 2004 16:59
To: [email protected]
Subject: Re: Nicest UTF

"Arcane Jill" <[email protected]> writes:
> Oh for a chip with 21-bit wide registers!
Not 21-bit but 20.087462841250343-bit :-)

-- 
__("< Marcin Kowalczyk
\__/ [email protected]
^^ http://qrnik.knm.org.pl/~qrczak/

This archive was generated by hypermail 2.1.5 : Mon Dec 06 2004 - 02:43:14 CST