Re: Ternary search trees for Unicode dictionaries

From: Mark Davis (
Date: Tue Nov 18 2003 - 00:39:14 EST

  • Next message: Jungshik Shin: "Re: How can I input any Unicode character if I know its hexadecimal code?"

    We tend to use tries, which have very good performance characteristics. See
    "bits of unicode" on my site:

    ► शिष्यादिच्छेत्पराजयम् ◄

    ----- Original Message -----
    From: "Theodore H. Smith" <>
    To: <>
    Sent: Mon, 2003 Nov 17 18:21
    Subject: Re: Ternary search trees for Unicode dictionaries

    > I've looked into the TST thing.
    > I'm not sure that it is optimal, despite how popular they are!
    > Look at this, if I add "1", "2","3", "4", "5", "6", "7", "8", "9" to a
    > TST, they will all be in a line, in the tree. All will be reference via
    > the "high" node.
    > So, to find "9", I have to read through 9 items!
    > Now, I'm not sure if this is a bad thing. It's just that compared to a
    > binary search on an array, in this example, a TST does more work.
    > When reading in data, that has already been sorted, I think this could
    > be a problem. It's fine for randomly ordered data, but with sorted
    > data... I can't tell how much of a problem it could be.
    > This is the kind of structure that punishes neat people. An unexpected
    > payback for the effort of being neat and ordered.
    > I'm quite new to this, and so I don't know if there is a good solution
    > to making the tree's balanced without imposing huge overhead, or if in
    > practice, this will be a problem.

    This archive was generated by hypermail 2.1.5 : Tue Nov 18 2003 - 01:27:53 EST