RE: Case mapping of dotless lowercase letters

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue Dec 16 2003 - 21:55:58 EST


    Kenneth Whistler wrote:
    > Correcting myself:
    > > Note that none of the 3 sets of equivalence classes violates
    > > *canonical* equivalence, because none of the 8 sequences involved
    > > is canonically equivalent to any other. In other words, no matter
    > > which of the 3 approaches you take to case folding, in no instance
    > > are you claiming that canonically equivalent sequences are to be
    > > interpreted differently.
    >
    > Actually, dotted I *is* canonically equivalent to <I, dot above>
    > (I overlooked that when compiling the summary.)

    I reached the same conclusion in my previous long analysis, except
    that I did not overlook this canonical equivalence.

    I also used a different notation to compare case foldings
    and case mappings.
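
    The canonical equivalence in question can be checked with a small
    sketch. Python and its standard unicodedata module are used here
    only for illustration (they are not part of the original
    discussion), and the variable names are hypothetical:

```python
import unicodedata

# U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTTED_CAPITAL_I = "\u0130"
# <I, dot above>: U+0049 followed by U+0307 COMBINING DOT ABOVE
I_PLUS_DOT = "I\u0307"

# Canonical decomposition (NFD) of the precomposed letter
decomposed = unicodedata.normalize("NFD", DOTTED_CAPITAL_I)
# Canonical composition (NFC) of the two-character sequence
composed = unicodedata.normalize("NFC", I_PLUS_DOT)

# Both comparisons hold when the two forms are canonically equivalent
print(decomposed == I_PLUS_DOT)
print(composed == DOTTED_CAPITAL_I)
```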

    I also concluded that using combining dots with i's is a big
    hack, and that this hack was introduced only in the Full
    case mappings, where it only confuses implementations and makes
    life even worse for programmers who expect correct behavior
    from case folding.
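
    To illustrate what the full mappings do with the dotted capital I,
    here is a minimal sketch. It assumes Python 3, whose str.lower()
    and str.casefold() apply the locale-independent full mappings; the
    helper name code_points is hypothetical:

```python
def code_points(text):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]

DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE

# Full lowercasing turns one character into two:
# a plain "i" followed by COMBINING DOT ABOVE.
lowered = DOTTED_CAPITAL_I.lower()
folded = DOTTED_CAPITAL_I.casefold()

print(code_points(lowered))
print(code_points(folded))
```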

    Moral: I no longer use case folding, which preserves
    canonical equivalence only by means of a hack. I use
    lowercase(uppercase()) instead, which respects canonical
    equivalence and is more coherent for full-text indexing,
    secure identification, case-insensitive file naming...
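
    A rough sketch of the lowercase(uppercase()) approach follows. It
    relies on Python 3's built-in, locale-independent str.upper() and
    str.lower(), which is an assumption made for illustration and not
    necessarily the implementation meant above; the function name
    lower_upper_key is hypothetical:

```python
import unicodedata

def lower_upper_key(text):
    """Build a caseless comparison key as lowercase(uppercase(text))."""
    return text.upper().lower()

DOTLESS_SMALL_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I
DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
I_PLUS_DOT = "I\u0307"       # <I, COMBINING DOT ABOVE>

# Dotless i and plain i end up with the same key ...
same_key = lower_upper_key(DOTLESS_SMALL_I) == lower_upper_key("i")
# ... whereas default case folding keeps them apart.
same_fold = DOTLESS_SMALL_I.casefold() == "i".casefold()

# Canonically equivalent inputs give canonically equivalent keys
# (compared here after NFC normalization).
key_a = unicodedata.normalize("NFC", lower_upper_key(DOTTED_CAPITAL_I))
key_b = unicodedata.normalize("NFC", lower_upper_key(I_PLUS_DOT))

print(same_key, same_fold, key_a == key_b)
```

    Whether merging dotless i with plain i is desirable depends on the
    application; the sketch only shows how the two approaches differ.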






    This archive was generated by hypermail 2.1.5 : Tue Dec 16 2003 - 22:38:04 EST