At 10:03 97-08-15 -0700, Kenneth Whistler wrote:
>>> And would it be useful
>>> to add a compatibility decomposition to the Unicode standard?
>> Sounds good to me; presumably it should be U+0073 U+0073 ("ss").
>I am working on a comprehensive default collation order for
>Unicode, and ran up against this issue even this morning while
>attempting to reconcile compatibility sequences against expected
>collation behavior. The easiest solution was to mark up the
>UnicodeData file I was using with a new compatibility decomposition
>tag. (I used "<sort> 0073 0073".)
>Since everyone implementing Unicode has to treat ß specially for
>casing and for collating,
I would add "for comparisons"... useful in search engines.
The same rationale applies to French "oe" versus "œ".
It is slightly problematic for "æ", which is a letter in Danish (and
officially in the UCS!), while in French (and I would say in English too) it
serves as a joined digraph for "ae" that the spelling requires.
Of course this character is much more frequent in Danish than in French, so
I would recommend that the Danish behaviour be the default. In French there
are in most cases alternative spellings for æ (cæsium can be written césium),
unlike "œ", where there is no option, stricto sensu. There is no alternative
in French for cæcum, though.
>it may indeed make sense to add a
>compatibility decomposition to the next edition of the UnicodeData
>file, perhaps with a new tag type. This is, however, a change to
>a normative part of the standard, and would have to get buy-in
>from the Unicode Technical Committee.
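
A minimal sketch of the comparison folding discussed in this thread, assuming Python's standard unicodedata module. The names SEARCH_FOLDS and search_key are hypothetical, invented for illustration, and the folding table is an application-level choice, not anything defined by the Unicode data files. It treats ß as "ss" and œ as "oe" for searching, and leaves æ alone unless the caller opts in, so that the Danish letter stays distinct by default as recommended above.

```python
import unicodedata

# Hypothetical application-level table; not part of any Unicode data file.
SEARCH_FOLDS = {"ß": "ss", "œ": "oe", "Œ": "OE"}


def search_key(text, fold_ae=False):
    """Return a key for loose comparison, e.g. in a search engine.

    fold_ae=False keeps "æ" as a distinct letter (the Danish reading);
    fold_ae=True treats it as the digraph "ae" (the French reading).
    """
    folds = dict(SEARCH_FOLDS)
    if fold_ae:
        folds.update({"æ": "ae", "Æ": "AE"})
    expanded = "".join(folds.get(ch, ch) for ch in text)
    # casefold() applies Unicode full case folding, which already maps
    # "ß" to "ss"; the table above covers the characters it leaves alone.
    return expanded.casefold()


# In the character database shipped with Python, none of these three
# characters carries a decomposition mapping, so normalization alone
# does not produce the folded forms.
for ch in "ßœæ":
    assert unicodedata.decomposition(ch) == ""

print(search_key("Straße") == search_key("STRASSE"))
print(search_key("cœur") == search_key("coeur"))
print(search_key("æble") == search_key("aeble"))
print(search_key("cæcum", fold_ae=True) == search_key("caecum"))
```

Keeping the æ folding behind a flag is one way to make the Danish behaviour the default while still letting a French or English search front end opt in to the digraph reading.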
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:36 EDT