> But do "friend" and "frigate" appear in Maltese dictionaries? Is it
> reasonable to expect that a single collating spec can correctly order words
> that follow two different collating conventions all at once?
My understanding of the post was that "friend" *is* a Maltese word
(in the same sense that "résumé" is an English word), but that it does
not contain the Maltese letter "ie". Therefore, there needs to be
a way to know when "ie" is to collate as a single letter and when
it is not to do so.
> Suppose it were the case that Maltese alphabetic order put the letter l
> before f.
I have a recollection of seeing a list of Chinese words written in pinyin
but alphabetized according to bopomofo rules. Is this commonplace?
> As a result, if we
> had a mixture of English (or French - more likely for Benin) and Adja
> words, we couldn't sort them using a single set of rules and have them come
> out correct for both languages at once.
To be sure. The trouble arises when the digraph sometimes sorts one way
and sometimes another. IIRC, in Danish "aa" sorts as "å" when it is an
archaic rendering of it, as in "Aarhus", but as "aa" when it is a borrowing,
as in "aardvark". That's the same case as Maltese, no? What is commonly
done for Danish sorting?
> In general, the best solution is: if a language has borrowings from another
> language, they take on the conventions of the receptor language.
A good principle, but it doesn't always work in the Real World.
Summary: Of course I agree with you that adding characters is not the answer,
and that tailored collation sequences mostly are --- but some languages
may have collation rules that look internally inconsistent when represented
in Unicode, and may require tricks with ZWNBSP, which I think is the Right
Thing in this case.
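
A rough sketch of that trick, for illustration only. It is not the Unicode
Collation Algorithm or any real library's API. The assumptions are mine: a
toy alphabet of the basic Latin letters with the digraph "ie" placed right
after "i", U+FEFF (ZWNBSP) as the character that breaks the digraph, and the
names BLOCKER, ALPHABET, RANK and sort_key, all of which are hypothetical.
"frisk" is only a filler string to compare against.

```python
import string

BLOCKER = "\ufeff"  # ZWNBSP, assumed here to be the digraph-breaking character

# Toy tailored alphabet (an assumption): a..z with "ie" right after "i".
ALPHABET = []
for ch in string.ascii_lowercase:
    ALPHABET.append(ch)
    if ch == "i":
        ALPHABET.append("ie")
RANK = {letter: n for n, letter in enumerate(ALPHABET)}


def sort_key(word):
    """Return a list of letter ranks, treating "ie" as one letter
    unless a BLOCKER sits between the "i" and the "e"."""
    text = word.lower()
    key = []
    i = 0
    while i < len(text):
        if text[i] == BLOCKER:
            i += 1  # the blocker itself carries no sorting weight
        elif text.startswith("ie", i):
            key.append(RANK["ie"])  # digraph collates as a single letter
            i += 2
        else:
            # characters outside the toy alphabet sort after all letters
            key.append(RANK.get(text[i], len(ALPHABET)))
            i += 1
    return key


# Without the blocker, "ie" is one letter that sorts after "i".
print(sorted(["friend", "frisk"], key=sort_key))
# With the blocker, "i" and "e" are compared as separate letters.
print(sorted(["fri\ufeffend", "frisk"], key=sort_key))
```

With the plain spelling, "friend" lands after "frisk" because the single
letter "ie" follows "i"; with the blocker inserted, it lands before "frisk"
because "e" precedes "s". The cost is that the text itself has to carry the
invisible character wherever the digraph is not intended.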
Schlingt dreifach einen Kreis um dies! || John Cowan <email@example.com>
Schliesst euer Aug vor heiliger Schau, || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,          || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.            -- Coleridge (tr. Politzer)
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:06 EDT