Re: FW: Matching Unicode strings and combining characters [was: basic

From: John Cowan (cowan@locke.ccil.org)
Date: Thu Sep 30 1999 - 12:36:47 EDT


Marco.Cimarosti@icl.com scripsit:

> In fact, as has already been said, why should the case of "login:^" be any
> different from the case of "login:Q"?
>
> What is the NEW problem brought by Unicode or combining characters?

The NEW problem is canonical equivalence: the idea that some user-level
graphemes can be transmitted as one character or two, at the sender's
option, since the receiver must not differentiate between the two
different representations.
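
As a small illustration (a sketch only, using Python's standard
unicodedata module; the variable names are mine and not part of the
thread), the two representations of n-with-tilde differ as raw code
point sequences but compare equal once both are normalized:

```python
import unicodedata

# One user-level grapheme, two canonically equivalent encodings.
precomposed = "\u00f1"   # LATIN SMALL LETTER N WITH TILDE (one code point)
decomposed = "n\u0303"   # "n" followed by COMBINING TILDE (two code points)

# A raw comparison treats them as different strings.
raw_equal = precomposed == decomposed

# After normalizing both sides to the same form, they are identical.
nfc_equal = (unicodedata.normalize("NFC", precomposed)
             == unicodedata.normalize("NFC", decomposed))
nfd_equal = (unicodedata.normalize("NFD", precomposed)
             == unicodedata.normalize("NFD", decomposed))

print(raw_equal, nfc_equal, nfd_equal)
```

Either normalization form works for the comparison, as long as both
sides use the same one.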

> Somebody says: if my application is waiting for "login", it will not trigger
> if it receives "logiñ" (where ñ is precomposed) but it would trigger with
> "login~", (where ~ is a combining mark). That is true, so what!?

The point is that either "logi0xf1" (precomposed ñ, code 0xF1) or
"login~" (n plus combining tilde) must be acceptable, and both must be
distinct from "login". Before combining characters, this wasn't a
problem.
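
A minimal sketch of that requirement, assuming a hypothetical
matches_prompt helper (my name, not anything from the thread) that
compares whole strings after NFC normalization, next to a naive prefix
test for contrast:

```python
import unicodedata


def naive_prefix_match(received: str, expected: str) -> bool:
    """Code-point prefix test: ignores any combining mark that follows."""
    return received.startswith(expected)


def matches_prompt(received: str, expected: str) -> bool:
    """Hypothetical helper: compare after normalizing both sides to NFC."""
    return (unicodedata.normalize("NFC", received)
            == unicodedata.normalize("NFC", expected))


with_precomposed = "logi\u00f1"   # "logi" + precomposed n-with-tilde
with_combining = "login\u0303"    # "login" + COMBINING TILDE

# The naive test fires on "login~" even though the user-level text differs.
false_trigger = naive_prefix_match(with_combining, "login")

# Both representations are accepted as the same string...
same_text = matches_prompt(with_combining, with_precomposed)

# ...and both remain distinct from plain "login".
distinct_a = not matches_prompt(with_precomposed, "login")
distinct_b = not matches_prompt(with_combining, "login")

print(false_trigger, same_text, distinct_a, distinct_b)
```

This compares complete strings only. A matcher reading from a stream
would also have to wait for the character after "n" before deciding,
since that character may turn out to be a combining mark.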

-- 
John Cowan                                   cowan@ccil.org
       I am a member of a civilization. --David Brin


