Re: Ternary search trees for Unicode dictionaries

From: Theodore H. Smith (delete@elfdata.com)
Date: Mon Nov 17 2003 - 21:21:05 EST

  • Next message: Doug Ewell: "Re: Problems encoding the spanish o"

    I've looked into the TST thing.

    I'm not sure that it is optimal, despite how popular they are!

    Look at this, if I add "1", "2","3", "4", "5", "6", "7", "8", "9" to a
    TST, they will all be in a line, in the tree. All will be reference via
    the "high" node.

    So, to find "9", I have to read through 9 items!

    Now, I'm not sure if this is a bad thing. It's just that compared to a
    binary search on an array, in this example, a TST does more work.

    When reading in data, that has already been sorted, I think this could
    be a problem. It's fine for randomly ordered data, but with sorted
    data... I can't tell how much of a problem it could be.

    This is the kind of structure that punishes neat people. An unexpected
    payback for the effort of being neat and ordered.

    I'm quite new to this, and so I don't know if there is a good solution
    to making the tree's balanced without imposing huge overhead, or if in
    practice, this will be a problem.



    This archive was generated by hypermail 2.1.5 : Mon Nov 17 2003 - 22:28:14 EST