Re: Changing UCA primary weights (bad idea)

From: Peter Kirk (peterkirk@qaya.org)
Date: Sat Jul 10 2004 - 03:26:25 CDT

    On 10/07/2004 01:34, Mark Davis wrote:

    >I'll try to pick out the relevant points.
    >
    >
    >
    >>Please do. Do you really want all those letters
    >>between "e" and "f" interfiled with "e"? I surely
    >>do not.
    >>
    >>
    >
    >You seem to have a misperception of what I think we should be looking at.
    >What I think we should be examining is which of the items that are not
    >interfiled (to use your phrasing) should be, if any. I don't think
    >everything should be. In particular, I think John's list is the list we
    >should be focusing on.
    >
    >
    >
    >>John's list?
    >>
    >>
    >
    >That was in my original mail, which you were commenting on when you changed
    >the subject line, but which you apparently didn't bother to actually
    >read. Here is the text:
    >
    >
    >
    >>>If you look at John's suggested file for diacritic
    >>>folding (http://www.ccil.org/~cowan/DiacriticFolding.txt), there are quite
    >>>a number that are not reflected in the UCA.
    >>>
    >>>
    >
    >
    >
    >>My point is made here. It is really only in
    >>initial position where this is likely to be
    >>noticed.
    >>
    >>
    >
    >This is incorrect. It will make a difference in other positions. Sorting
    >"Søren" after "Sozar" in a long list, if someone isn't expecting it, will
    >cause problems. They look for it after "Soret", don't see it on the page,
    >and assume it isn't there; fooled by the fact that it is on a completely
    >different page.
    >
    >

    I agree with you on this. I just checked this with some real data, a set
    of several thousand e-mail messages from a list. One Danish participant
    is Søren Holst and so called in the name field of his e-mails, but signs
    himself "Soren" in messages in English. If I type "Soren" into the name
    search box (in Mozilla 1.7), I get no matches. This is not what I
    expect, because to me, and to Søren himself when thinking in English, ø
    is a variant of o. (But actually Mozilla is inconsistent: when sorting
    it put Søren after Sonny but before Soshie.)
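
    To make the mismatch concrete, here is a minimal sketch in Python of
    what I would expect from a folded search and a folded sort. It is not
    the UCA and not what Mozilla does; the EXTRA_FOLDS table, fold() and
    matches() are hypothetical names of my own, and the table holds only
    the one letter pair needed for this example (a real one would look
    more like John's list). Note that ø has no canonical decomposition, so
    stripping combining marks alone would not fold it.

```python
import unicodedata

# Hypothetical, deliberately tiny folding table for letters that have no
# canonical decomposition (NFD leaves them unchanged).
EXTRA_FOLDS = {"ø": "o", "Ø": "O"}


def fold(text):
    """Strip combining marks and apply the extra folds above."""
    out = []
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            continue  # drop accents that NFD split off from base letters
        out.append(EXTRA_FOLDS.get(ch, ch))
    return "".join(out)


def matches(query, name):
    """Case-insensitive substring search on the folded forms."""
    return fold(query).casefold() in fold(name).casefold()


names = ["Sozar", "Soshie", "Søren", "Sonny", "Soret"]

# Plain code point order: ø (U+00F8) sorts after every ASCII letter.
by_code_point = sorted(names)

# Folded key first, original spelling as a tie-breaker.
by_folded_key = sorted(names, key=lambda n: (fold(n).casefold(), n))

print(matches("Soren", "Søren Holst"))
print(by_code_point)
print(by_folded_key)
```

    With this, a search for "Soren" finds "Søren Holst", code point order
    leaves Søren after Sozar, and the folded key files it between Sonny
    and Soret.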

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    

