Compression and Unicode [was: Name Compression]

From: Juliusz Chroboczek (jec@dcs.ed.ac.uk)
Date: Thu May 11 2000 - 01:20:14 EDT


mohrin@sharmahd.com (Torsten Mohrin) writes:

TM> In SC UniPad we use a compressed name table. The names are compressed
TM> by encoding the words either in one or two bytes. The separators
TM> (space and hyphen-minus) are encoded in a special way. It works as
TM> follows:

[explanation snipped]

Why not use Huffman encoding? You could precompute the Huffman tables
once and for all, compile them into your program, and only do the
actual encoding/decoding at runtime.

It would be a little bit more computationally expensive than your
scheme due to the need to access parts of bytes, but would yield a
much better compression ratio.
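
To make the suggestion concrete, here is a minimal sketch in Python of
Huffman-coding character names word by word. It is an illustration under
stated assumptions, not UniPad's scheme and not a tuned implementation:
the five sample names, the END marker, and the helper names
(build_codes, tokenize, encode, decode) are all made up for this
example. In a real program the code table would be computed once from
the full name list and compiled in, so only encode/decode would run at
runtime.

```python
import heapq
import re
from collections import Counter

END = "\0"  # hypothetical end-of-name marker, coded like any other symbol

def tokenize(name):
    # Words plus the two separators (space, hyphen-minus), then the end marker.
    return re.findall(r"[A-Z0-9]+|[ -]", name) + [END]

def build_codes(freqs):
    """Build a Huffman code {symbol: bit string} from {symbol: count}."""
    codes = {s: "" for s in freqs}
    # Heap entries are (weight, unique tie-breaker, symbols in this subtree).
    heap = [(w, i, [s]) for i, (s, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        codes[heap[0][2][0]] = "0"
        return codes
    tie = len(heap)
    while len(heap) > 1:
        w1, _, g1 = heapq.heappop(heap)
        w2, _, g2 = heapq.heappop(heap)
        for s in g1:
            codes[s] = "0" + codes[s]  # lighter subtree gets a leading 0
        for s in g2:
            codes[s] = "1" + codes[s]  # the other subtree gets a leading 1
        heapq.heappush(heap, (w1 + w2, tie, g1 + g2))
        tie += 1
    return codes

def encode(name, codes):
    bits = "".join(codes[t] for t in tokenize(name))
    bits += "0" * (-len(bits) % 8)  # pad to a whole number of bytes
    return int(bits, 2).to_bytes(len(bits) // 8, "big")

def decode(data, codes):
    inverse = {c: s for s, c in codes.items()}
    out, cur = [], ""
    for byte in data:
        for shift in range(7, -1, -1):  # this is the "parts of bytes" access
            cur += "1" if (byte >> shift) & 1 else "0"
            if cur in inverse:
                if inverse[cur] == END:
                    return "".join(out)
                out.append(inverse[cur])
                cur = ""
    raise ValueError("end marker not found")

# Tiny sample standing in for the full name list (an assumption).
NAMES = [
    "LATIN SMALL LETTER A",
    "LATIN CAPITAL LETTER A",
    "GREEK SMALL LETTER ALPHA",
    "HYPHEN-MINUS",
    "LATIN SMALL LETTER A WITH GRAVE",
]
CODES = build_codes(Counter(t for n in NAMES for t in tokenize(n)))

for n in NAMES:
    packed = encode(n, CODES)
    print(len(n), "->", len(packed), "bytes:", decode(packed, CODES))
```

The decoder walks the input one bit at a time, which is the extra cost
mentioned above; a table-driven decoder would be faster, but the idea is
the same.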

More generally, I get the impression that the Unicode community is
particularly keen on inventing /ad hoc/ compression schemes. I still
haven't heard a sound rationale for the existence of SCSU. What's
wrong with patent-free variants of LZW?


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:02 EDT