RE: That UTF-8 Rant

From: Gianni Mariani (gianni@corp.webtv.net)
Date: Thu Jul 22 1999 - 22:46:42 EDT


gzip would do a whole lot better than switching encodings. Besides,
disk space is so cheap, even in the hundreds-of-GB range. I bought a
100 GB storage system for $3000 for my home PC! I'm sure
your customer could spend a couple of extra bucks for UTF-8
without breaking a sweat! Your consulting fee to convert to
SJIS would cost a whole lot more.
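
One way to check the gzip claim is to measure it. The sketch below is only an illustration: the helper name raw_and_gzip_sizes and the sample line are made up for this example, and the sample is far more repetitive than real customer data, so the ratios it prints say little about a production database.

```python
import gzip

# Encodings under discussion; "utf-16-le" avoids counting a byte-order mark.
ENCODINGS = ("shift_jis", "utf-16-le", "utf-8")

def raw_and_gzip_sizes(text):
    """Return {encoding: (raw bytes, gzip-compressed bytes)} for text."""
    sizes = {}
    for enc in ENCODINGS:
        raw = text.encode(enc)
        sizes[enc] = (len(raw), len(gzip.compress(raw)))
    return sizes

if __name__ == "__main__":
    # Hypothetical, highly repetitive sample; real data will compress less well.
    text = "注文番号 A-1024 東京都\n" * 1000
    for enc, (raw, packed) in raw_and_gzip_sizes(text).items():
        print(f"{enc:10s} raw={raw:7d} gzip={packed:6d}")
```

Running the same function over an export of real character columns would show whether the encoding still matters once the data is compressed.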

-----Original Message-----
From: Gary Roberts [mailto:gar@sparc.sandiegoca.ncr.com]
Sent: Thursday, July 22, 1999 5:38 PM
To: Unicode List
Subject: Re: That UTF-8 Rant

On Thu, 22 Jul 1999, Markus Kuhn wrote:

> Actually, I happen to be extremely interested in exactly these
> questions, because I happen to be someone who makes implementation
> decisions about databases that could one day grow into the
> hundreds-of-gigabyte range. I have not yet seen multi-terabyte plain
> text databases though (perhaps the email/fax eavesdroppers at the NSA
> have these, if anyone ;-), these tend more to be filled with images and
> not text.

We have many customers with multi-terabyte databases. Our Japanese
customers in particular have claimed a high percentage of character data
(The rest is almost entirely numeric). Our Unicode (UTF-16)
implementation is criticized as being inefficient in storage relative to
Shift-JIS (which we also support). I suspect a UTF-8 implementation would
be unpopular.
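
The storage comparison above can be made concrete by encoding the same text three ways and counting bytes. This is a rough sketch, not a model of any real database: the function name encoded_sizes and both sample strings are invented for illustration.

```python
# Illustrative sketch: bytes needed to store the same text in each encoding.
# The sample strings are hypothetical, not taken from any customer database.

# "utf-16-le" is used so that no byte-order mark is counted.
ENCODINGS = ("shift_jis", "utf-16-le", "utf-8")

def encoded_sizes(text):
    """Return {encoding name: encoded length in bytes} for text."""
    return {enc: len(text.encode(enc)) for enc in ENCODINGS}

if __name__ == "__main__":
    samples = {
        "kana/kanji only": "日本語のテキストデータ",
        "mixed with ASCII": "注文番号 A-1024 東京都",
    }
    for label, text in samples.items():
        print(label, len(text), "chars", encoded_sizes(text))
```

On text that is purely full-width kana and kanji, Shift-JIS and UTF-16 both take 2 bytes per character and UTF-8 takes 3. Shift-JIS pulls ahead of UTF-16 only where the data contains ASCII or half-width katakana, which it stores in 1 byte each.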


