Re: Case blind comparison

From: Kent Karlsson (keka@im.se)
Date: Thu Jul 31 1997 - 06:21:29 EDT

Next message: Martin J. Duerst: "Re: Unicode use for end-users"
Previous message: Graham_Rhind@otsgroup.nl: "Unicode use for end-users"
Maybe in reply to: Gary Roberts: "Case blind comparison"
Next in thread: Gary Roberts: "Re: Case blind comparison"

Gary Roberts wrote:

> Ken and Kent bring up certain canonical equivalences, which the
> technique I proposed will not handle.

> I am now tempted to include mapping
>
> U+0340 -> U+0300
...
> U+232A -> U+3009

And the fullwidth ASCII should be mapped to "ordinary ASCII",
the halfwidth Katakana should be mapped to ordinary Katakana,
the presentation forms for Arabic mapped to their ordinary forms,
the Hangul syllables mapped to their Hangul Jamo strings,
compositions should be normalised, ...
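
The mappings listed above can be sketched in Python as follows. This is
only an illustration, not the technique proposed in this thread: it assumes
the compatibility decomposition (NFKD) in Python's standard unicodedata
module covers the mappings mentioned here, and it applies the language's
built-in str.casefold() for the case-blind part. The names fold_key and
caseless_equal are hypothetical.

```python
import unicodedata


def fold_key(text: str) -> str:
    """Return a comparison key for case-blind, equivalence-aware matching.

    Assumption: NFKD stands in for the hand-written mapping table
    (singletons such as U+0340 -> U+0300, fullwidth and halfwidth forms,
    Arabic presentation forms, Hangul syllables -> Jamo).
    """
    # Decompose first so that canonically equivalent inputs look alike.
    decomposed = unicodedata.normalize("NFKD", text)
    # Case folding is the case-blind step.
    folded = decomposed.casefold()
    # Folding can produce characters that decompose further, so normalise again.
    return unicodedata.normalize("NFKD", folded)


def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings by their folded keys."""
    return fold_key(a) == fold_key(b)
```

For example, caseless_equal("\uff21\uff22\uff23", "abc") compares fullwidth
"ABC" with ordinary lowercase "abc" and treats them as equal. The sketch is
a simplification: it ignores language-specific case rules (such as Turkish
dotless i) and says nothing about ordering, only equality.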

I don't want to discourage you, but comparison of Unicode strings
is non-trivial, even when case sensitive.

/kent k

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:36 EDT