Re: UTF-8 and Kermit

From: Martin J. Duerst (mduerst@ifi.unizh.ch)
Date: Wed Jul 16 1997 - 04:23:38 EDT


On Tue, 15 Jul 1997, Frank da Cruz wrote:

> You have -- thanks. I'll do some reading before making any more noise in
> this vein. And yet, from what you say, it sounds like there is still
> evidently a gotcha, namely that one must choose in advance between two
> different worlds: UTF-8 and ISO 2022 (e.g. Latin-1, etc, with C1 controls).
> So if the host uses UTF-8 but I am using a VT320 (or emulator), or the host
> uses ISO 2022 with C1 controls and my emulator is switched to UTF, there
> will be trouble. But that's not so bad, since we must do this now anyway
> prior to making connections to systems that don't use ISO 2022, or that use
> PC code pages (etc) on the wire.

Because UTF-8 is constructed so that it allows character boundary detection,
it is extremely regular and therefore easy to detect. The same applies to
ISO 2022 with C1 controls: before you can use SS2 or SS3, you have to
declare what you want in G2 or G3. Automatic detection needs a little bit
of code, but is not too difficult.
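
As a rough illustration of the "little bit of code" for the UTF-8 side, here
is a minimal sketch in Python. It is not from Kermit or any existing emulator;
the function name guess_wire_encoding and its three return labels are
hypothetical. It assumes the modern UTF-8 definition (sequences of at most
four bytes, lead bytes 0xC2 to 0xF4) and checks only the lead/trail byte
structure, so it does not reject every ill-formed sequence.

```python
def guess_wire_encoding(data: bytes) -> str:
    """Classify a byte buffer as 'ascii', 'utf-8', or 'other'.

    Hypothetical helper: a structural heuristic, not a full validator.
    """
    i, n = 0, len(data)
    saw_multibyte = False
    while i < n:
        b = data[i]
        if b < 0x80:
            # Plain 7-bit byte: valid in both worlds, tells us nothing.
            i += 1
            continue
        # Lead byte determines how many trail bytes must follow.
        if 0xC2 <= b <= 0xDF:
            trail = 1
        elif 0xE0 <= b <= 0xEF:
            trail = 2
        elif 0xF0 <= b <= 0xF4:
            trail = 3
        else:
            # 0x80-0x9F (C1 controls such as SS2/SS3), a stray trail byte,
            # or an invalid lead byte: not UTF-8.
            return "other"
        tail = data[i + 1 : i + 1 + trail]
        # Every trail byte must be in 0x80-0xBF, and none may be missing.
        if len(tail) != trail or any(not 0x80 <= t <= 0xBF for t in tail):
            return "other"
        saw_multibyte = True
        i += 1 + trail
    return "utf-8" if saw_multibyte else "ascii"
```

For example, guess_wire_encoding("Dürst".encode("utf-8")) yields "utf-8",
while the same name encoded as Latin-1, or a buffer starting with the C1
control SS2 (0x8E), yields "other". A sequence cut off at the end of the
buffer is also reported as "other", so a terminal would need to buffer
partial input first. Short Latin-1 strings can happen to look like valid
UTF-8, so this is a heuristic that gets more reliable with more data.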

Regards, Martin.


