Re: Ternary search trees for Unicode dictionaries

From: Mark Davis (mark.davis@jtcsv.com)
Date: Tue Nov 18 2003 - 00:39:14 EST

Next message: Jungshik Shin: "Re: How can I input any Unicode character if I know its hexadecimal code?"

Previous message: Eric Muller: "Linguistic Diversity and National Unity: Language Ecology in Thailand"
In reply to: Theodore H. Smith: "Re: Ternary search trees for Unicode dictionaries"
Next in thread: Philippe Verdy: "Re: Ternary search trees for Unicode dictionaries"
Reply: Philippe Verdy: "Re: Ternary search trees for Unicode dictionaries"
Reply: Theodore H. Smith: "Re: Ternary search trees for Unicode dictionaries"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]
Mail actions: [ respond to this message ] [ mail a new topic ]

We tend to use tries, which have very good performance characteristics. See
"bits of unicode" on my site: www.macchiato.com.

Mark
__________________________________
http://www.macchiato.com
► शिष्यादिच्छेत्पराजयम् ◄

----- Original Message -----
From: "Theodore H. Smith" <delete@elfdata.com>
To: <unicode@unicode.org>
Sent: Mon, 2003 Nov 17 18:21
Subject: Re: Ternary search trees for Unicode dictionaries

> I've looked into the TST thing.
>
> I'm not sure that it is optimal, despite how popular they are!
>
> Look at this, if I add "1", "2","3", "4", "5", "6", "7", "8", "9" to a
> TST, they will all be in a line, in the tree. All will be reference via
> the "high" node.
>
> So, to find "9", I have to read through 9 items!
>
> Now, I'm not sure if this is a bad thing. It's just that compared to a
> binary search on an array, in this example, a TST does more work.
>
> When reading in data, that has already been sorted, I think this could
> be a problem. It's fine for randomly ordered data, but with sorted
> data... I can't tell how much of a problem it could be.
>
> This is the kind of structure that punishes neat people. An unexpected
> payback for the effort of being neat and ordered.
>
> I'm quite new to this, and so I don't know if there is a good solution
> to making the tree's balanced without imposing huge overhead, or if in
> practice, this will be a problem.
>
>
>

Next message: Jungshik Shin: "Re: How can I input any Unicode character if I know its hexadecimal code?"
Previous message: Eric Muller: "Linguistic Diversity and National Unity: Language Ecology in Thailand"
In reply to: Theodore H. Smith: "Re: Ternary search trees for Unicode dictionaries"
Next in thread: Philippe Verdy: "Re: Ternary search trees for Unicode dictionaries"
Reply: Philippe Verdy: "Re: Ternary search trees for Unicode dictionaries"
Reply: Theodore H. Smith: "Re: Ternary search trees for Unicode dictionaries"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]
Mail actions: [ respond to this message ] [ mail a new topic ]

This archive was generated by hypermail 2.1.5 : Tue Nov 18 2003 - 01:27:53 EST