Re: Case mapping of dotless lowercase letters

From: Kenneth Whistler (kenw@sybase.com)
Date: Tue Dec 16 2003 - 20:21:53 EST


    Correcting myself:

    > Note that none of the 3 sets of equivalence classes violates
    > *canonical* equivalence, because none of the 8 sequences involved
    > is canonically equivalent to any other. In other words, no matter
    > which of the 3 approaches you take to case folding, in no instance
    > are you claiming that canonically equivalent sequences are to be
    > interpreted differently.

    Actually, dotted I *is* canonically equivalent to <I, dot above>.
    (I overlooked that when compiling the summary.)

    Hence the equivalence classes for simple case folding:

       C. { dotted I }
       D. { <i, dot above>, <I, dot above> }
       
    *do* violate canonical equivalence. And that is the whole
    reason for the separate definition of full case folding,
    which defines the equivalence class:

       G. { dotted I, <i, dot above>, <I, dot above> }

    which observes canonical equivalence, but which has the
    drawback of string length change in case folding.

    --Ken
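
A minimal sketch of the correction above, assuming Python's standard
`unicodedata` module and the built-in `str.casefold()`, which applies full
case folding. The helper `simple_fold` is hypothetical and only approximates
simple case folding by keeping the one-to-one mappings; it is not a
substitute for the S and C entries of CaseFolding.txt, but for dotted I
(U+0130) it behaves the same way, leaving the character unchanged.

```python
import unicodedata

DOTTED_I = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE

upper_seq = "I" + DOT_ABOVE  # <I, dot above>
lower_seq = "i" + DOT_ABOVE  # <i, dot above>


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def simple_fold(s: str) -> str:
    """Hypothetical approximation of simple case folding.

    Uses the full folding of each character only when it is a single
    code point; otherwise the character is left unchanged.
    """
    out = []
    for ch in s:
        folded = ch.casefold()
        out.append(folded if len(folded) == 1 else ch)
    return "".join(out)


# Dotted I is canonically equivalent to <I, dot above>.
print(canonically_equivalent(DOTTED_I, upper_seq))                 # True

# Simple folding puts them in different classes (C versus D).
print(simple_fold(DOTTED_I) == simple_fold(upper_seq))             # False

# Full folding puts all three sequences in one class (G) ...
print(DOTTED_I.casefold() == upper_seq.casefold() == lower_seq.casefold())  # True

# ... at the cost of a change in string length.
print(len(DOTTED_I), len(DOTTED_I.casefold()))                     # 1 2
```

The second check is the violation described in the message: two canonically
equivalent inputs fold to different results under simple folding, while full
folding keeps them together by expanding one code point into two.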


