Re: Korean compression (was: Re: Ternary search trees for Unicode dictionaries)

From: Doug Ewell
Date: Sun Nov 23 2003 - 15:10:33 EST

    Mark Davis <mark dot davis at jtcsv dot com> wrote:

    >> Of course, no compression format applied to jamos could
    >> even do as well as UTF-16 applied to syllables, i.e. 2 bytes per
    >> syllable.
    > This needs a bit of qualification. An arithmetic compression would do
    > better, for example, or even just a compression that took the most
    > frequent jamo sequences. Perhaps the above is better phrased as 'no
    > simple byte-level compression format...'.

    Yes, that's what I meant: a compression *format* like SCSU or BOCU-1, as
    opposed to a (general-purpose) compression *algorithm* like Huffman or
    LZ or arithmetic coding. The distinction makes sense in the context of
    my paper, but I probably should have explained it here.
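
    To make the distinction concrete, here is a rough Python sketch (not
    taken from the paper). It compares precomposed syllables in UTF-16,
    the same text decomposed to conjoining jamos in UTF-16, and the jamo
    text run through a general-purpose algorithm, with zlib's DEFLATE
    standing in for "LZ plus Huffman". The helper names and the sample
    text are made up for illustration, and the sample is deliberately
    repetitive, so the zlib figure shows only that such an algorithm is
    not bound by a per-syllable byte cost; it says nothing about ratios
    on real Korean text.

```python
import unicodedata
import zlib


def utf16_size(text: str) -> int:
    """Size in bytes of text encoded as UTF-16 (big-endian, no BOM)."""
    return len(text.encode("utf-16-be"))


def to_jamos(text: str) -> str:
    """Decompose precomposed Hangul syllables into conjoining jamos (NFD)."""
    return unicodedata.normalize("NFD", text)


def deflated_size(text: str) -> int:
    """Size in bytes after zlib (DEFLATE) compression of the UTF-16 bytes."""
    return len(zlib.compress(text.encode("utf-16-be"), 9))


if __name__ == "__main__":
    word = "한국어"  # 3 syllables, which decompose into 3 + 3 + 2 = 8 jamos
    print("syllables, UTF-16:", utf16_size(word), "bytes")
    print("jamos, UTF-16:    ", utf16_size(to_jamos(word)), "bytes")

    # Artificially repetitive sample: an assumption made only so that the
    # general-purpose compressor has something to work with.
    sample = word * 500
    print("syllables, UTF-16:", utf16_size(sample), "bytes")
    print("jamos, zlib:      ", deflated_size(to_jamos(sample)), "bytes")
```

    A byte-level format such as SCSU or BOCU-1 encodes each character as
    it arrives, so the jamo text cannot drop below the cost of its
    individual jamos; zlib can, because it models repeated sequences.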

    BTW, the paper is awaiting final comments from one last reviewer.

    -Doug Ewell
     Fullerton, California
