Re: unidata is big

From: Mark Davis
Date: Tue Apr 23 2002 - 16:59:10 EDT

One of the Dublin papers talks about how this is done in ICU:


Γνῶθι σαυτόν — Θαλῆς
[For transliteration, see]

----- Original Message -----
From: "Geoffrey Waigh" <>
To: <>
Sent: Sunday, April 21, 2002 03:28
Subject: Re: unidata is big

> > I would just like to know if someone could give me a tip on how to
> > structure all the unicode-information in memory?
> >
> > All the UNIDATA does contain quite a bit of information and I can't see
> > any obvious method of which is memory-efficient and gives fast
> a) you see if there is a Unicode friendly library you can use that
> does this for you.
> b) you write a program to parse the file and extract what you
> need. With clever data encoding you can pack most of the fields of
> UNIDATA into a very tight space. Long ago in the Unicode conference
> proceedings somebody illustrated how they used trie structures to
> efficiently build the lookup tables - the boring parts of the encoding
> space have shorter branches than the areas where every codepoint is
> different from its neighbour.
> Geoffrey
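
To make the trie idea above concrete, here is a minimal sketch in Python of a
two-stage lookup table, which is the simplest form of such a trie. It is not
taken from ICU or from the conference paper mentioned above. The names
BLOCK_BITS, build_two_stage and lookup are made up for illustration, the
256-code-point block size is an assumption, and the General_Category values
come from Python's own unicodedata module rather than from a parsed
UnicodeData.txt.

```python
import unicodedata

BLOCK_BITS = 8                  # assumed block size: 256 code points per block
BLOCK_SIZE = 1 << BLOCK_BITS
MAX_CP = 0x110000               # one past the last Unicode code point

def build_two_stage(prop):
    """Return (index, blocks) for a per-character property function.

    index[cp >> BLOCK_BITS] gives a block number; blocks[n] holds the
    property values for the 256 code points in that block. Identical
    blocks (the "boring" stretches) are stored only once.
    """
    index = []
    blocks = []
    seen = {}                   # block contents -> block number
    for start in range(0, MAX_CP, BLOCK_SIZE):
        block = tuple(prop(chr(cp)) for cp in range(start, start + BLOCK_SIZE))
        if block not in seen:
            seen[block] = len(blocks)
            blocks.append(block)
        index.append(seen[block])
    return index, blocks

def lookup(index, blocks, cp):
    """Two array accesses: high bits pick the block, low bits the entry."""
    return blocks[index[cp >> BLOCK_BITS]][cp & (BLOCK_SIZE - 1)]

# Build the table for General_Category and compare sizes.
index, blocks = build_two_stage(unicodedata.category)
print(len(index), "index entries,", len(blocks), "distinct blocks")
print(lookup(index, blocks, ord("A")))
```

Large unassigned or uniform ranges collapse onto a handful of shared blocks,
so the table stays small while each lookup costs only a shift, a mask and two
array accesses. A production table would likely pack the values into small
integers and might add a third stage, but that is beyond this sketch.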
