>>> And would it be useful
>>> to add a compatibility decomposition to the Unicode standard?
>> Sounds good to me; presumably it should be U+0073 U+0073 ("ss").
>I am working on a comprehensive default collation order for
>Unicode, and ran up against this issue even this morning while
>attempting to reconcile compatibility sequences against expected
>collation behavior. The easiest solution was to mark up the
>UnicodeData file I was using with a new compatibility decomposition
>tag. (I used "<sort> 0073 0073".)
>Since everyone implementing Unicode has to treat ß specially for
>casing and for collating,

[Alain] :
I would add "for comparisons"... useful in search engines.

The same rationale applies to French "oe" versus "œ".
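
To make the comparison case concrete, here is a minimal sketch in Python
of a loose matching key of the kind a search engine might use. The names
EXPANSIONS and search_key are hypothetical, and the table is my own
illustration, not something taken from the UnicodeData file. It assumes
case-insensitive matching is wanted.

```python
# Hypothetical expansion table for loose comparisons.
# These mappings are illustrative; they are not in UnicodeData.
EXPANSIONS = {
    "\u00df": "ss",  # ß
    "\u0153": "oe",  # œ
    "\u0152": "OE",  # Œ
}


def search_key(text):
    """Return a key for loose, case-insensitive comparisons."""
    # Expand each listed character, leave everything else untouched.
    expanded = "".join(EXPANSIONS.get(ch, ch) for ch in text)
    # casefold() is more aggressive than lower(); it also maps ß to "ss".
    return expanded.casefold()
```

With this key, "Straße" and "STRASSE" compare equal, as do "cœur" and
"coeur", while "æ" is left alone.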

Slightly problematic for "æ", which is a letter in Danish (and officially in
the UCS!), while it serves as a spelling-required joined digraph
for "ae" in French (and I would say in English too).

Of course in Danish this character is much more frequent than in French,
so I would recommend that the Danish case be the default behaviour. (In
French there are in most cases alternative spellings for æ: cæsium can be
written césium, unlike "œ", where there is no option, stricto sensu.
There is no option in French for cæcum, though.)
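
A rough sketch of what "Danish as the default" could look like, again in
Python and again with hypothetical names (DEFAULT_EXPANSIONS,
FRENCH_EXPANSIONS, comparison_key). It assumes the default table leaves
æ as a letter of its own and that a French tailoring adds the "ae"
expansion on top.

```python
# Default table: æ is NOT expanded, so it stays a distinct letter
# (the Danish behaviour). Illustrative only.
DEFAULT_EXPANSIONS = {"\u0153": "oe"}  # œ

# Hypothetical French tailoring: same as default, plus æ -> "ae".
FRENCH_EXPANSIONS = {**DEFAULT_EXPANSIONS, "\u00e6": "ae"}  # æ


def comparison_key(text, tailoring=None):
    """Return a comparison key using the default or a tailored table."""
    table = DEFAULT_EXPANSIONS if tailoring is None else tailoring
    # Case-fold first so that Œ and Æ hit the lowercase table entries.
    return "".join(table.get(ch, ch) for ch in text.casefold())
```

By default "cæcum" and "caecum" stay distinct; passing FRENCH_EXPANSIONS
as the tailoring makes them compare equal.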

Alain LaBonté
[Kenneth] :
>it may indeed make sense to add a
>compatibility decomposition to the next edition of the UnicodeData
>file, perhaps with a new tag type. This is, however, a change to
>a normative part of the standard, and would have to get buy-in
>from the Unicode Technical Committee.

