One of the Dublin papers talks about how this is done in ICU:
Γνῶθι σαυτόν — Θαλῆς ("Know thyself", Thales)
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]
----- Original Message -----
From: "Geoffrey Waigh" <firstname.lastname@example.org>
Sent: Sunday, April 21, 2002 03:28
Subject: Re: unidata is big
> > I would just like to know if someone could give me a tip on how to
> > structure all the Unicode information in memory?
> > The UNIDATA files contain quite a bit of information, and I don't
> > see any obvious method that is memory-efficient and gives fast
> > lookup.
> a) you see if there is a Unicode friendly library you can use that
> does this for you.
> b) you write a program to parse the file and extract what your program
> needs. With clever data encoding you can pack most of the fields of
> UNIDATA into a very tight space. Long ago in the Unicode conference
> proceedings, somebody illustrated how they used trie structures to
> build the lookup tables - the boring parts of the encoding space get
> shorter branches than the areas where every codepoint is different
> from its neighbour.
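The trie idea in (b) can be sketched as a two-stage lookup table. This is a minimal illustration, not ICU's actual data layout; the names `build_trie` and `lookup`, the 256-entry block size, and the integer property encoding are all assumptions made for the example. Identical blocks are shared, so uniform ("boring") ranges of the codepoint space cost almost nothing:

```python
def build_trie(props, block_size=256, max_cp=0x110000):
    """Build a two-stage lookup table from a dict of codepoint -> small
    integer property value (e.g. a general-category code parsed from
    UnicodeData.txt). Unassigned codepoints get value 0."""
    index = []   # stage 1: block number for each codepoint // block_size
    blocks = []  # stage 2: deduplicated blocks of property values
    seen = {}    # block contents -> block number, for sharing
    for base in range(0, max_cp, block_size):
        block = tuple(props.get(cp, 0) for cp in range(base, base + block_size))
        if block not in seen:
            seen[block] = len(blocks)
            blocks.append(block)
        index.append(seen[block])
    return index, blocks

def lookup(trie, cp, block_size=256):
    """Two array indexings: high bits pick the block, low bits the entry."""
    index, blocks = trie
    return blocks[index[cp // block_size]][cp % block_size]
```

For example, a table covering all 0x110000 codepoints but with only two assigned properties collapses to three shared blocks (one all-zero block plus the two that differ), while lookup stays O(1).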
This archive was generated by hypermail 2.1.2 : Tue Apr 23 2002 - 17:40:22 EDT