Re: property, character, and sequence name loose matching

From: karl williamson (public@khwilliamson.com)
Date: Fri Mar 12 2010 - 00:12:38 CST

Next message: Michael Everson: "Re: ß vs. ſs"

Previous message: Phillips, Addison: "RE: New First Public Working Draft: Additional Requirements for Bidi in HTML"
In reply to: Andrew West: "Re: property, character, and sequence name loose matching"
Next in thread: Asmus Freytag: "Re: property, character, and sequence name loose matching"
Reply: Asmus Freytag: "Re: property, character, and sequence name loose matching"

Andrew West wrote:
> On 11 March 2010 20:32, karl williamson <public@khwilliamson.com> wrote:
>> I think it is actually better to do the following:
>> 1. Remove all white space
>> 2. Collapse multiple hyphens in a row into one
>> 3. Lowercase
>> 4. If the result is one of the three problematic ones, we are done.
>> 5. Remove all hyphens
>>
>> Then, if the strings are the same after the transforms, they match.
>
> No, then "TIBETAN MARK TSA PHRU" would match "TIBETAN MARK TSA -PHRU",
> which may be what the user intended, but it is not what they asked
> for, and would be as bad as matching e.g. "PERCENT IGN" and "PERCENT
> SIGN".
>
> Andrew
>

OK, but that is a change from what TR18 says: "names should use a loose
match, disregarding case, spaces and hyphen" except for the three
problematic situations it mentions. There is no character TIBETAN MARK
TSA PHRU, and I thought the whole point of loose matching was to follow
the intent of the user even in the face of certain missing or extraneous
punctuation and spacing characters. So even though the match is not
exactly what they asked for, it is close enough by the traditional
definition.
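
For concreteness, the five steps I proposed could be sketched in Python
roughly as follows. This is only an illustration, not text from either
TR. The names loose_key, loose_match and PROBLEMATIC are made up for the
example, and the three entries in PROBLEMATIC are my assumption about
which hyphenated names are meant by "the three problematic ones".

```python
import re

# Assumption: the three hyphenated names that the message calls
# "problematic", written as they look after steps 1-3.
PROBLEMATIC = frozenset({
    "tibetanletter-a",
    "tibetansubjoinedletter-a",
    "hanguljungseongo-e",
})


def loose_key(name: str) -> str:
    """Reduce a character name to a key for loose comparison."""
    key = re.sub(r"\s+", "", name)    # 1. remove all white space
    key = re.sub(r"-{2,}", "-", key)  # 2. collapse runs of hyphens into one
    key = key.lower()                 # 3. lowercase
    if key in PROBLEMATIC:            # 4. the hyphen is significant: stop here
        return key
    return key.replace("-", "")       # 5. remove all remaining hyphens


def loose_match(a: str, b: str) -> bool:
    """Two names match if their keys are identical."""
    return loose_key(a) == loose_key(b)
```

Under this sketch, "TIBETAN MARK TSA PHRU" and "TIBETAN MARK TSA -PHRU"
reduce to the same key and therefore match, which is the behavior being
objected to, while "PERCENT IGN" and "PERCENT SIGN" do not match because
no letters are ever dropped.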

I realize that TR18 is not an official part of the standard, whereas
TR44 is now UAX44 and so is part of it. Therefore, this is a change in
the standard that I don't believe was listed as a delta.

