Re: UTS#10 (collation) : request for a new "Separating" mode for variable weighting (3.2.2)

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sat Jul 31 2010 - 09:04:43 CDT

    As an option (which may in fact be the default), this mode should also
    treat each run of "variable" elements specially, giving only the FIRST
    element of the run the non-ignorable primary weight [.0201].

    If a run contains several "variable" elements, the subsequent ones get
    the primary weight [.0000] instead, but they preserve all their 3
    other weights.

    So in this mode with this option for runs of variable elements:

    " " (one space) would collate as [.0201.020A.0020.0002.0020]

    "  " (two spaces) would collate as
    [.0201.020A.0020.0002.0020][.0000.020A.0020.0002.0020]

    This option for runs of variable elements is useful because it makes
    two strings collate together at the primary level when they contain
    the same words separated by different numbers of spaces.
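
    The run rule above can be sketched in a few lines of Python. This is
    only an illustration under stated assumptions, not an implementation
    of UTS#10: the weights for SPACE are the ones quoted in this message,
    while the weights for letters and the names separating_elements and
    primary_key are hypothetical and exist only for this sketch.

```python
# First variable element of a run gets this primary weight (from the message).
SEPARATOR_PRIMARY = 0x0201

# Remaining weights quoted in the message for U+0020 SPACE.
SPACE_TAIL = (0x020A, 0x0020, 0x0002, 0x0020)


def separating_elements(text):
    """Return one toy collation element (a 5-tuple) per character.

    Only SPACE is treated as "variable" here. Letter weights are
    hypothetical placeholders, not values from any real collation table.
    """
    elements = []
    in_run = False  # True while inside a run of variable elements
    for ch in text:
        if ch == " ":
            # Only the first element of a run keeps a non-zero primary.
            primary = 0x0000 if in_run else SEPARATOR_PRIMARY
            elements.append((primary,) + SPACE_TAIL)
            in_run = True
        else:
            # Hypothetical weights for non-variable characters.
            elements.append((0x1000 + ord(ch.lower()), 0x0000, 0x0020, 0x0002, ord(ch)))
            in_run = False
    return elements


def primary_key(text):
    """Level-1 sort key: the non-zero primary weights, in order."""
    return [e[0] for e in separating_elements(text) if e[0] != 0]
```

    With these toy weights, primary_key("a b") and primary_key("a   b")
    are identical, while primary_key("ab") differs from both because the
    separator still contributes one primary weight.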

    This new "Separating" mode will also be very useful for plain-text
    searches (working at level 1 only), as a MUCH better alternative to
    "Shifted", which finds too many occurrences because it ignores all
    word separations and all spaces in arbitrary runs. So when searching
    for "km/h" it will still find "km h", or "km-h", or "km  h" (two
    spaces in the middle), or "km / h".
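
    A level-1 search along these lines could look like the sketch below.
    It is a rough illustration, not a real collation API: the set of
    "variable" characters is reduced to space, hyphen and slash, letter
    weights are hypothetical, and the names primary_key and contains are
    made up for this example.

```python
SEPARATOR_PRIMARY = 0x0201       # primary weight from the message
VARIABLE = frozenset(" -/")      # assumption: a tiny stand-in for the variable set


def primary_key(text):
    """Level-1 key: each run of variable characters yields one separator weight."""
    key = []
    in_run = False
    for ch in text:
        if ch in VARIABLE:
            if not in_run:
                key.append(SEPARATOR_PRIMARY)
            in_run = True
        else:
            # Hypothetical primary weight for non-variable characters.
            key.append(0x1000 + ord(ch.lower()))
            in_run = False
    return key


def contains(haystack, needle):
    """True if the needle's primary key occurs contiguously in the haystack's key."""
    h, n = primary_key(haystack), primary_key(needle)
    return any(h[i:i + len(n)] == n for i in range(len(h) - len(n) + 1))


for candidate in ["km h", "km-h", "km  h", "km / h", "kmh"]:
    print(repr(candidate), contains(candidate, "km/h"))
```

    In this sketch every candidate except "kmh" matches "km/h". A search
    that ignores variable elements entirely, as "Shifted" does at level 1,
    would match "kmh" as well.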

    Philippe.


