Re: Korean compression (was: Re: Ternary search trees for Unicode dictionaries)

From: Frank Yung-Fong Tang (ytang0648@aol.com)
Date: Tue Dec 02 2003 - 18:03:59 EST


    Mark Davis wrote:

    > > >> UTF-16 6,634,430 bytes
    > > >> UTF-8 7,637,601 bytes
    > > >> SCSU 6,414,319 bytes
    > > >> BOCU-1 5,897,258 bytes
    > > >> Legacy encoding (*) 5,477,432 bytes
    > > >> (*) KS C 5601, KS X 1001, or EUC-KR

    What are the sizes of these after gzip? Just wondering:
    gzip of UTF-16
    gzip of UTF-8
    gzip of SCSU
    gzip of BOCU-1
    gzip of Legacy encoding
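
    One rough way to get such numbers is sketched below. It is only an illustration under stated assumptions: the helper name gzipped_sizes and the sample text are made up here, and the sample is not the corpus measured above. The Python standard library has no SCSU or BOCU-1 codec, so those two are left out and would need a separate implementation.

```python
import gzip

def gzipped_sizes(text, encodings=("utf-16-le", "utf-8", "euc_kr")):
    """Return {encoding: (raw_size, gzipped_size)} in bytes for text.

    Hypothetical helper for illustration only. "utf-16-le" is used so
    that no byte order mark is counted in the raw size.
    """
    sizes = {}
    for name in encodings:
        raw = text.encode(name)
        sizes[name] = (len(raw), len(gzip.compress(raw)))
    return sizes

if __name__ == "__main__":
    # Made-up, highly repetitive sample; a real corpus will compress
    # very differently.
    sample = "유니코드는 모든 문자에 고유한 번호를 부여합니다. " * 200
    for name, (raw, packed) in gzipped_sizes(sample).items():
        print(f"gzip of {name}: {raw} -> {packed} bytes")
```

    Running the same function over the actual test file would give figures directly comparable to the table quoted above.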

    -- 
    Frank Yung-Fong Tang
    Šýštém Årçhîtéçt, Iñtërnâtiônàl Dèvélôpmeñt, AOL Intèrâçtívë Sërviçes
    AIM:yungfongta   mailto:ytang0648@aol.com Tel:650-937-2913
    Yahoo! Msg: frankyungfongtan
    


    This archive was generated by hypermail 2.1.5 : Tue Dec 02 2003 - 18:45:00 EST