Re: property, character, and sequence name loose matching

From: Andrew West (andrewcwest@gmail.com)
Date: Thu Mar 11 2010 - 17:32:01 CST


    On 11 March 2010 20:32, karl williamson <public@khwilliamson.com> wrote:
    >
    > I think it is actually better to do the following:
    > 1. Remove all white space
    > 2. Collapse multiple hyphens in a row into one
    > 3. Lowercase
    > 4. If the result is one of the three problematic ones, we are done.
    > 5. Remove all hyphens
    >
    > Then, if the strings are the same after the transforms, they match.

    No, then "TIBETAN MARK TSA PHRU" would match "TIBETAN MARK TSA -PHRU",
    which may be what the user intended, but it is not what they asked
    for, and would be as bad as matching e.g. "PERCENT IGN" and "PERCENT
    SIGN".
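
For illustration only, the five quoted steps can be sketched in Python to show the behaviour objected to above. This is a rough sketch, not anyone's actual implementation: the function names are hypothetical, and because the message does not list the "three problematic" names, the set below is an assumption (the three names usually cited as colliding once hyphens are removed).

```python
import re

# Assumption: the message does not say which names are "the three
# problematic ones"; these entries are the forms produced by steps 1-3
# for three names commonly cited as clashing when hyphens are dropped.
PROBLEMATIC = frozenset({
    "tibetanletter-a",
    "tibetansubjoinedletter-a",
    "hanguljungseongo-e",
})


def loose_key(name: str) -> str:
    """Apply the five quoted steps and return the comparison key."""
    key = re.sub(r"\s+", "", name)    # 1. remove all white space
    key = re.sub(r"-{2,}", "-", key)  # 2. collapse runs of hyphens into one
    key = key.lower()                 # 3. lowercase
    if key in PROBLEMATIC:            # 4. problematic name: stop here
        return key
    return key.replace("-", "")       # 5. remove all hyphens


def loose_match(a: str, b: str) -> bool:
    """Two names match if their keys are identical after the transforms."""
    return loose_key(a) == loose_key(b)


# The case raised in the reply: the hyphen is dropped, so these match.
print(loose_match("TIBETAN MARK TSA PHRU", "TIBETAN MARK TSA -PHRU"))
# A name in the assumed problematic set keeps its hyphen and stays distinct.
print(loose_match("TIBETAN LETTER -A", "TIBETAN LETTER A"))
```

Under these assumptions the first comparison succeeds because step 5 discards the hyphen in "-PHRU", while the second fails because step 4 returns early and preserves it.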

    Andrew


