Re: A basic question on encoding Latin characters

From: John Cowan (cowan@locke.ccil.org)
Date: Tue Sep 28 1999 - 16:47:23 EDT


Kenneth Whistler scripsit:

> Of course it is. If the application is waiting for "login:", it is not
> waiting for "login:" with an acute accent on the colon. It is interpreting
> what it is supposed to, given the characters encoded at the code
> values they have. If the communicator then sends a combining acute accent,
> that is a *protocol* error, not a Unicode compliance problem.

But suppose the application is waiting for a word ending in c-acute.
Unicode conformance rule 9 requires that either U+0107 or U+0063 U+0301
be accepted. So far so good, although the end-state detector
has to be more complicated.
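
A minimal sketch of such a detector, assuming the application can normalize
what it has received before comparing. The function name
ends_with_equivalent and the sample strings are hypothetical, used only for
illustration; unicodedata.normalize is the Python standard library call.

```python
import unicodedata

def ends_with_equivalent(received: str, expected_suffix: str) -> bool:
    """Return True if `received` ends with text canonically equivalent
    to `expected_suffix` (hypothetical helper, not from any real protocol)."""
    # Normalize both sides to NFC so that U+0063 U+0301 and U+0107
    # compare as the same code point sequence.
    norm_received = unicodedata.normalize("NFC", received)
    norm_expected = unicodedata.normalize("NFC", expected_suffix)
    return norm_received.endswith(norm_expected)

# Precomposed form: U+0107
print(ends_with_equivalent("Mili\u0107", "\u0107"))    # True
# Decomposed form: U+0063 followed by U+0301
print(ends_with_equivalent("Milic\u0301", "\u0107"))   # True
# A bare c is not equivalent to c-acute
print(ends_with_equivalent("Milic", "\u0107"))         # False
```

This only works cleanly when the detector sees the whole word at once; it
says nothing about when to stop reading.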

Remember that this is an ad-hoc situation, where the communicator is
not expecting to talk to a scripting application. This is not a matter
of protocol design, but rather of working with an existing partner
which cannot be easily changed.

If another possible protocol state is triggered by a string ending in
U+0063, then the problem becomes the severe version already described.
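
One way to picture the conflict, as a sketch under assumptions: a detector
that checks its triggers after every incoming code point, with two
hypothetical states named plain_c and c_acute. The names, the triggers, and
the sample input are all invented for illustration.

```python
import unicodedata

def naive_detect(stream, triggers):
    """Feed code points one at a time and return the name of the first
    trigger whose suffix matches the normalized buffer, or None."""
    buf = ""
    for ch in stream:
        buf += ch
        norm = unicodedata.normalize("NFC", buf)
        for name, suffix in triggers.items():
            if norm.endswith(suffix):
                return name
    return None

# Hypothetical protocol states: one ends in U+0063, one in c-acute.
triggers = {"plain_c": "c", "c_acute": "\u0107"}

# The precomposed form reaches the intended state.
print(naive_detect("Mili\u0107", triggers))    # c_acute
# The decomposed form fires on U+0063 before U+0301 has arrived.
print(naive_detect("Milic\u0301", triggers))   # plain_c
```

In this sketch the detector cannot know, at the moment U+0063 arrives,
whether a combining mark is still to come, so it would have to wait for
some further input before committing to either state.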

-- 
John Cowan                                   cowan@ccil.org
       I am a member of a civilization. --David Brin


