Re: C1 controls and terminals (was: Re: Euro character in ISO)

From: Frank da Cruz
Date: Wed Jul 12 2000 - 11:42:07 EDT

> Frank da Cruz wrote:
> > . If you send a code in the 0x80-0x9f range to such a terminal or
> > emulator, it properly treats it as a control code. If it was
> > intended as a graphic character ("smart quote" or somesuch) the
> > result is a fractured screen, sometimes even a frozen session.
>
> This is the widely reported compatibility problem between UTF-8 and
> terminals. I know I read somewhere, possibly on Markus Kuhn's Unicode
> page, possibly somewhere else, that ISO 2022 codes exist to switch out
> of "ISO 2022 mode" and into "UTF-8 mode" and to either allow or prevent
> switching back to 2022. Is there any progress on implementing this so
> terminals and emulators can live with UTF-8?

Maybe Markus can clarify. I would be surprised if there's anything in
ISO 2022 about UTF-8, except that it does provide a way to switch out of
and back into ISO 2022 mode, allowing the use of character sets that do
not comply with ISO 2022 and ISO 4873. That's what the designating escape
sequences "with standard return" and "without standard return" are for.

But that's not quite the same thing. There is no good reason why UTF-8
couldn't be used by (say) a VT320 emulator without switching out of the
ISO 2022 regime, except that UTF-8 contains C1 control codes as data.
This was discussed here a while back and "the other Markus" showed how
a C1-safe form of UTF-8 could have been designed:

But, as they say, "it's too late now". Therefore, those of us who want
to make use of UTF-8 within the ISO 2022 regime must reverse the layers.
First decode the UTF-8, then parse for escape sequences. Of course your
emulator can get into awful trouble that way if the data stream isn't
really UTF-8. But overall it's not that bad; we can live with it, and
in fact have done it this way in practice in our own emulator.
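
To make the layer reversal concrete, here is a minimal sketch in Python. It
is an illustration only, not the code of any real emulator; the names
c1_bytes and decode_then_parse are hypothetical. It shows that the UTF-8
form of a "smart quote" contains bytes in the 0x80-0x9f range, and that
decoding first means only genuine C1 code points reach the control parser.

```python
import codecs

C1_RANGE = range(0x80, 0xA0)  # 0x80-0x9f, the C1 control area


def c1_bytes(data):
    """Raw bytes that a terminal parsing before decoding would treat as C1."""
    return [b for b in data if b in C1_RANGE]


def classify(ch):
    """Label one decoded character as a C0 control, C1 control, or graphic."""
    cp = ord(ch)
    if cp < 0x20 or cp == 0x7F:
        return ("C0", cp)
    if cp in C1_RANGE:
        return ("C1", cp)
    return ("GRAPHIC", ch)


def decode_then_parse(chunks):
    """Reversed layering: decode the UTF-8 first, then look for controls.

    Assumption: malformed input is replaced with U+FFFD rather than
    rejected; a real emulator might handle it differently.
    """
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    events = []
    for chunk in chunks:
        # The incremental decoder buffers sequences split across chunks.
        events.extend(classify(ch) for ch in decoder.decode(chunk))
    # Flush any incomplete trailing sequence.
    events.extend(classify(ch) for ch in decoder.decode(b"", final=True))
    return events


if __name__ == "__main__":
    quote = "\u201c".encode("utf-8")           # bytes E2 80 9C
    print([hex(b) for b in c1_bytes(quote)])   # C1-range bytes inside the data
    print(decode_then_parse([b"\xe2\x80", b"\x9cA"]))
    print(decode_then_parse([b"\xc2\x9b"]))    # U+009B (CSI) encoded as UTF-8
    print(decode_then_parse([b"\x9b"]))        # a bare 8-bit CSI is not UTF-8
```

The last call shows the trouble mentioned above: a bare 0x9b byte from a
stream that isn't really UTF-8 is malformed input, so under this sketch's
replacement policy it no longer arrives as a control at all.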

- Frank
