John Cowan recently said:
> My understanding of the post was that "friend" *is* a Maltese word
> (in the same sense that "résumé" is an English word), but that it does
> not contain the Maltese letter "ie". Therefore, there needs to be
> a way to know when "ie" is to collate as a single letter and when
> it is not to do so.
> Peter_Constable@sil.org wrote:
> > In general, the best solution is: if a language has borrowings from another
> > language, they take on the conventions of the receptor language.
> A good principle, but it doesn't always work in the Real World.
> Summary: Of course I agree with you that adding characters is not the answer,
> and that tailored collation sequences mostly are --- but some languages
> may have collation rules that look internally inconsistent when represented
> in Unicode, and may require tricks with ZWNBSP, which I think is the Right
> Thing in this case.
I'm not sure that you can expect users to insert special codes for rare cases,
especially if the codes do not affect the visible rendering. Someone is bound to
forget, or you may be importing text from another system.
I think tailored collation sequences are the way to go. If you really want
foreign words like "friend" sorted differently, then compile a list of the
problematic ones and add explicit collation rules. E.g. "friend" becomes
<f> <r> <i> <e> <n> <d>. (That will also deal with "friendship" and "pen-friend"
if they are in the language too.) This could also be done with smaller sequences
of text that represent syllables, if that were easier and unambiguous.
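
To make the idea concrete, here is a rough Python sketch of an exception list layered on top of a tailored alphabet. Everything in it is an assumption for illustration: the ALPHABET table is a simplified stand-in rather than an authoritative Maltese collation table, EXCEPTIONS is a hypothetical one-entry list, and the names collation_elements and sort_key are made up. A real implementation would more likely express the same rules as a tailoring in a collation library.

```python
# Illustrative sketch only: ALPHABET is a simplified stand-in for a tailored
# ordering in which "ie" (and "għ") collate as single letters. It is an
# assumption, not an authoritative collation table.
ALPHABET = ["a", "b", "ċ", "d", "e", "f", "ġ", "g", "għ", "h", "ħ", "i", "ie",
            "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v",
            "w", "x", "ż", "z"]
WEIGHT = {letter: rank for rank, letter in enumerate(ALPHABET)}

# Hypothetical exception list: borrowed stems whose "ie" is really <i> <e>.
EXCEPTIONS = {"friend": ["f", "r", "i", "e", "n", "d"]}


def collation_elements(word):
    """Split a word into collation elements, applying explicit rules first."""
    word = word.lower()
    elements = []
    pos = 0
    while pos < len(word):
        # Explicit rules win: a listed stem is split exactly as specified,
        # wherever it occurs (so "friendship" and "pen-friend" are covered).
        for stem, letters in EXCEPTIONS.items():
            if word.startswith(stem, pos):
                elements.extend(letters)
                pos += len(stem)
                break
        else:
            # Otherwise prefer a two-character letter such as "ie" if it
            # matches, and fall back to a single character.
            pair = word[pos:pos + 2]
            if len(pair) == 2 and pair in WEIGHT:
                elements.append(pair)
                pos += 2
            else:
                elements.append(word[pos])
                pos += 1
    return elements


def sort_key(word):
    """Map a word to a list of weights; unknown characters sort last."""
    return [WEIGHT.get(e, len(ALPHABET)) for e in collation_elements(word)]
```

With these rules, collation_elements("bieb") keeps "ie" together as one element, while collation_elements("friend") splits it into "i" and "e". Passing sort_key as the key argument to sorted() then gives the tailored order. The cost of this approach is that the exception list has to be maintained by hand, but the text itself needs no invisible characters.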
-- Tim Partridge. Any opinions expressed are mine only and not those of my employer
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:06 EDT