RE: Diacritic and similar foldings and spam filtering

From: Mike Ayers
Date: Thu Jul 08 2004 - 18:32:00 CDT


    > On Behalf Of Peter Kirk
    > Sent: Thursday, July 08, 2004 2:36 PM

    > Could something like this be defined within the framework of UTR #30?
    > Should it be defined within the UTR? I suspect it would be better left
    > to the discretion of individual developers, who could then rapidly
    > tailor their foldings to any new lookalikes exploited by spammers.

    Absolutely and exactly. The tricks in spam move much too fast for
    standardization to keep up, and the work would have no value outside of the
    spam arena, which has a limited lifespan, at least as we know it. The
    more general diacritic removal work, however, is quite useful.
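
To make the distinction concrete, here is a rough Python sketch, not anything taken from UTR #30 itself. It assumes the general diacritic removal can be approximated by canonical decomposition plus stripping combining marks, and it keeps the spam-specific lookalikes in a separate table that a developer edits by hand. The names remove_diacritics, fold_for_spam_matching and LOOKALIKES, and the table entries, are made up for illustration.

```python
import unicodedata

# Hypothetical, developer-maintained lookalike table (an assumption for
# illustration, not a standardized list). This is the part that would be
# tailored quickly as spammers exploit new lookalikes.
LOOKALIKES = {
    "0": "o",
    "1": "i",
    "$": "s",
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
}


def remove_diacritics(text):
    """General-purpose folding: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return unicodedata.normalize("NFC", stripped)


def fold_for_spam_matching(text):
    """Spam-specific folding layered on top of the general one."""
    folded = remove_diacritics(text).casefold()
    return "".join(LOOKALIKES.get(ch, ch) for ch in folded)


print(remove_diacritics("caf\u00e9"))                      # cafe
print(fold_for_spam_matching("Fr\u00e9e M0n\u0435y"))      # free money
```

Only the table would need to change when a new trick shows up; the diacritic removal step stays the same and is useful on its own, for example in search.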

