Re: Is it safe to dig into comment contents of PropList.txt?

From: Markus Scherer <markus.icu_at_gmail.com>
Date: Tue, 5 Nov 2013 14:27:19 -0800

On Tue, Nov 5, 2013 at 12:57 PM, Steffen Daode <sdaoden_at_gmail.com> wrote:

> Saving space is a particularly frightening topic, but I'm far away
> from that (since there is so little functionality yet). The ICU 52.1
> data library I compiled two weeks ago is an incredible 23.5 MB,
> and that after more than a decade of engineering experience, so...
> the lunatic is at least ahead.
>

The majority of that data is CLDR data for 600+ locales, plus some 4MB of
character conversion mapping tables, some 4.5MB of collation data (sorting
& searching), some 2.5MB of line-break dictionaries, transliteration rules,
...

The Unicode NFC/NFD data in ICU is about 32kB, the case mappings some 20kB,
bidi some 20kB, miscellaneous properties are somewhere near 50kB I think.
(All sizes from memory, didn't double-check them.) It is possible to make
them smaller, but I chose to construct the data so that it's
memory-mappable and usable as is at runtime.
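
To illustrate what "memory-mappable and usable as is" can mean, here is a
minimal sketch in Python. It is not ICU's actual file format or API: the
header layout, the "DEMO" magic value, and the names write_table and
MappedTable are all made up for the example. The point is only that the
file is laid out so that a lookup reads straight from the mapped bytes,
with no parsing or unpacking step at load time.

```python
import mmap
import struct

# Hypothetical layout (NOT ICU's real format): an 8-byte header with a
# 4-byte magic value and a 32-bit entry count, followed by that many
# 16-bit values. Native byte order is used, so the file is only valid on
# machines with the same endianness as the one that wrote it.
HEADER = struct.Struct("=4sI")
VALUE = struct.Struct("=H")
MAGIC = b"DEMO"


def write_table(path, values):
    """Build step: write the table in its final, runtime-ready layout."""
    with open(path, "wb") as f:
        f.write(HEADER.pack(MAGIC, len(values)))
        for v in values:
            f.write(VALUE.pack(v))


class MappedTable:
    """Runtime side: map the file read-only and index into it directly."""

    def __init__(self, path):
        self._file = open(path, "rb")
        # Length 0 maps the whole file; pages are loaded lazily by the OS.
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        magic, self.count = HEADER.unpack_from(self._map, 0)
        if magic != MAGIC:
            self.close()
            raise ValueError("unexpected file format")

    def __getitem__(self, index):
        if not 0 <= index < self.count:
            raise IndexError(index)
        # Fixed-width entries make the offset a simple multiplication.
        offset = HEADER.size + index * VALUE.size
        return VALUE.unpack_from(self._map, offset)[0]

    def close(self):
        self._map.close()
        self._file.close()
```

The trade-off in such a design is that fixed-width, directly indexable
entries take more space than a compressed encoding would, but startup cost
is essentially zero and the read-only pages can be shared between processes.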

markus
Received on Tue Nov 05 2013 - 16:29:42 CST