Re: Compression rates of text data

From: Kent Karlsson (
Date: Mon Jun 23 1997 - 12:33:16 EDT

Randolph S. Williams wrote:
> > Does anyone have experience or information on the
> > compression of text data in Unicode? I would be
> > interested to hear if any vendors compress text data
> > stored in Unicode and how much space savings they
> > have experienced. I would be interested in hearing
> > how various compression routines do with respect to
> > Unicode data.

Mirko Raner replied:
> There was an article about "Reuters Compression Scheme
> for Unicode" (RCSU) in the conference proceedings for
> IUC 10 (Proceedings Part 2, Slot B12). There is also a
> web document about RCSU somewhere.
> The article contains very good statistics about the
> results of applying RCSU compression to text documents
> in several languages. There are also statistics about
> RCSU followed by a secondary LZW compression.
> However, at our company (MATHEMA Software GmbH) a new,
> more efficient compression scheme is being developed
> which we will (hopefully) present at IUC 11. Special
> transformations in this compression scheme provide for
> optimal compression rates of secondary LZ-based
> algorithms.

Ultracode (a bar code based on Unicode) also applies,
before "barification", a "compression scheme" similar
to, but not the same as, RCSU. Ultracode was also
presented at IUC 10.

See also "Unicode 2.0 based bar codes" sent to this
list on Fri, 20 Jun 1997 10:36:22 -0700 (PDT).
(Note: I have no affiliation with Zebra Technologies.)
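
For anyone who wants to measure this on their own text, here is a
minimal sketch in Python. It rests on several assumptions: zlib's
DEFLATE (LZ77-based) stands in for the secondary LZ stage, so it is
not LZW; the byte-plane split is only a toy pre-transformation, not
RCSU, the Ultracode scheme, or the MATHEMA scheme; and the names
split_planes, join_planes and report are hypothetical.

```python
import zlib


def split_planes(utf16: bytes) -> bytes:
    """Toy transform (an assumption, not RCSU): high bytes first, then low bytes.

    Expects UTF-16-BE input, so the length is always even.
    """
    return utf16[0::2] + utf16[1::2]


def join_planes(data: bytes) -> bytes:
    """Inverse of split_planes: interleave the two halves again."""
    half = len(data) // 2
    out = bytearray(len(data))
    out[0::2] = data[:half]
    out[1::2] = data[half:]
    return bytes(out)


def report(text: str) -> dict:
    """Return byte counts for a few encoding and compression combinations."""
    raw = text.encode("utf-16-be")
    return {
        "utf16_raw": len(raw),
        "utf16_deflate": len(zlib.compress(raw, 9)),
        "planes_deflate": len(zlib.compress(split_planes(raw), 9)),
        "utf8_deflate": len(zlib.compress(text.encode("utf-8"), 9)),
    }


if __name__ == "__main__":
    # Repetitive sample text; replace with a real document for meaningful numbers.
    sample = "Ελληνικό κείμενο για δοκιμή συμπίεσης. " * 50
    for name, size in report(sample).items():
        print(f"{name:15} {size:6d} bytes")
```

The ratios depend heavily on the language and the length of the
text, so the numbers this prints say nothing about the statistics in
the IUC 10 article.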

                /kent k
