Re: Case mapping of dotless lowercase letters

From: Doug Ewell (dewell@adelphia.net)
Date: Mon Dec 15 2003 - 12:45:27 EST

    Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

    > I agree with your argument related to the difference between dotted
    > and dotless letters, except that the current case mappings make a
    > difference of behavior when comparing uppercase words or lowercase
    > words: a difference is kept in the case mappings for the lowercase
    > words, which is not kept for the case mappings of the uppercase words.

    I'm no expert on the default case mappings, but it does seem odd if,
    say, "Diyarbakır" is distinguished from misspellings involving the two
    different i's, but "DİYARBAKIR" is not. Perhaps someone will come on
    and explain why this is so.

    > The consequence is that two words that compare distinct with case
    > mappings will no longer compare distinct if they are converted to
    > uppercase with the default locale-neutral full mappings (this problem
    > does not occur with the Turkic-specific full case mappings). That's
    > all I am saying, and I don't want to reform the case mappings for
    > Turkic languages, just demonstrate a caveat for the default locale-
    > neutral mappings.

    The caveats are well-known and well-publicized. If you want
    Turkic-specific case mapping behavior, you really have to use the
    Turkic-specific case mapping tables.
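
    To make the caveat concrete, here is a minimal sketch in Python,
    whose str.upper() applies the default locale-neutral mappings. The
    helper upper_turkic() is a hypothetical name introduced only for
    illustration; it handles just the dotted/dotless i pair and is not a
    substitute for the full Turkic-specific tables.

```python
# Two lowercase spellings that differ only in the final vowel.
dotless = "diyarbak\u0131r"  # correct spelling, ends in U+0131 (dotless i) + r
dotted = "diyarbakir"        # misspelling with U+0069 (dotted i)

# Default (locale-neutral) uppercasing maps both i's to U+0049,
# so the distinction present in lowercase is lost.
default_dotless = dotless.upper()
default_dotted = dotted.upper()


def upper_turkic(s: str) -> str:
    """Hypothetical helper: uppercase with Turkic handling of i only.

    Dotted i (U+0069) is mapped to capital I with dot above (U+0130)
    before the default mapping is applied; dotless i (U+0131) already
    maps to plain I (U+0049) under the default rules.
    """
    return s.replace("i", "\u0130").upper()


turkic_dotless = upper_turkic(dotless)
turkic_dotted = upper_turkic(dotted)

print(default_dotless == default_dotted)  # the default mapping collapses them
print(turkic_dotless == turkic_dotted)    # the Turkic mapping keeps them apart
```

    In real code, a library that ships the Turkic tables and accepts a
    locale argument would be a safer choice than a hand-written
    replacement like this one.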

    -Doug Ewell
     Fullerton, California
     http://users.adelphia.net/~dewell/


