Re: data for collation tests

From: Alain LaBonté (alb@sct.gouv.qc.ca)
Date: Fri Feb 07 1997 - 11:43:47 EST


At 07:03 97-02-07 -0800, Michael Everson wrote:
>Letter-by-letter sorting is used in Danish; word-by-word sorting is used in
>English. ISO CD 14651 specifies a toggle in the default sort since you can
>never tell, really, what people will want. (Alain LaBonté doesn't like
>word-by-word sorting; most OS's use it however.)

Don't forget that words may include spaces, even in the OED, but certainly
in French. "Vice versa" is one word in French: each separated group of
letters is not a word in this case, and there are myriad other examples.
The notion of word that you use is Germanic. It works well in that context,
where spaces would be removed to form a new "word". The OED has tens of
pages of definitions for "word", including one which states precisely
that a word may contain spaces.
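
To make the two orderings under discussion concrete, here is a minimal sketch in Python. The function names and the sample entries are hypothetical, and the keys are deliberately simplified (lowercased ASCII only); this is not the ISO CD 14651 algorithm, only an illustration of how treating a space as a separator or ignoring it changes the result.

```python
def letter_by_letter_key(entry):
    # Ignore spaces entirely: the entry is compared as one unbroken
    # run of letters, so "new york" is compared as "newyork".
    return entry.lower().replace(" ", "")


def word_by_word_key(entry):
    # Compare space-separated groups one at a time: a shorter first
    # group ("new") sorts before any longer group that starts with it.
    return entry.lower().split()


# Hypothetical sample entries, chosen only to show the difference.
entries = ["newspaper", "new york", "newt", "new deal"]

letter_sorted = sorted(entries, key=letter_by_letter_key)
word_sorted = sorted(entries, key=word_by_word_key)

print(letter_sorted)  # ['new deal', 'newspaper', 'newt', 'new york']
print(word_sorted)    # ['new deal', 'new york', 'newspaper', 'newt']
```

Under the letter-by-letter key, an entry such as "vice versa" is handled as a single unit, which matches the view that a word may contain spaces.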

Alain LaBonté
Québec


