Re: A basic question on encoding Latin characters

From: Frank da Cruz (fdc@watsun.cc.columbia.edu)
Date: Wed Sep 29 1999 - 16:59:01 EDT


Paul Keinanen wrote:
> ...
> The terminal would have to send the NUL character after each normal or
> combining character entered on the keyboard and the remote echo would
> have to also echo back this NUL character. However, the terminal
> driver would have to strip off the NUL before returning the character
> or combined character back to the application.
>
I don't think there is much point in designing new techniques for the
"host" that issues the prompt. These are not just UNIX computers, they are
modems, routers, terminal servers, diverse embedded controllers, medical
equipment, die-cutting machines, and other devices that we can't change (*).

(*) I recently had a tech support call from a user who was having
    difficulty scripting Kermit to control a router, and much confusion
    ensued until I realized he was talking about a tool for gouging
    channels into a piece of wood.

It has always been possible to write scripts to mimic human actions at a
terminal when interacting with these devices, and therefore to automate
tasks that were not designed with automation in mind.

The interesting aspect for us is what happens when our terminal emulator
(and scripting engine) is Unicode-based; for example, it is designed for
UTF-8 only. In this case, the ASCII prompts and other outputs from the
device would pass through transparently but for the combining character
issue.
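
To illustrate both halves of that statement, here is a rough sketch in
Python (my choice for illustration only; it is not how any particular
terminal emulator does it, and the function name clusters is made up).
The first check shows that an ASCII prompt has the same bytes in UTF-8.
The generator then shows the combining-character catch: a base character
cannot be handed over as "complete" until the next non-combining
character arrives. It assumes a nonzero canonical combining class is a
good enough test for "combining mark", which is a simplification, since
some marks have class 0.

```python
import unicodedata

# An ASCII prompt is byte-for-byte identical when encoded as UTF-8.
assert "login: ".encode("utf-8") == "login: ".encode("ascii")

def clusters(stream):
    """Group incoming characters into base + combining-mark units.

    Simplification: a character counts as "combining" only if its
    canonical combining class is nonzero.
    """
    pending = ""
    for ch in stream:
        if pending and unicodedata.combining(ch) == 0:
            # A new base character has arrived, so the previous unit
            # is only now known to be complete.
            yield pending
            pending = ch
        else:
            pending += ch
    if pending:
        # Only reachable at end of input; a live session has no such
        # signal, so the last character of a prompt stays in doubt.
        yield pending

units = list(clusters("e\u0301> "))
```

Here units comes out as the accented e (two code points), then ">", then
the space. Note that the final space is released only because the input
ended, which is exactly what a script waiting for a prompt cannot rely on.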

But by now we have discussed this to death (until next time :-). I agree
with Rick, who said "I think many of the problems would evaporate if the
Linux people actually sat down and tried to do new-fangled 'terminal
emulation' from scratch, assuming Unicode with combining marks, and not
assuming the fixed-width char-per-glyph kind of columnar space that worked
in the past for ASCII." At least in spirit -- I don't know about the
fixed-width part. The point of a terminal, as distinct from a Web browser
or word processor (etc.), is the fixed-width aspect. I can send you email
(as I am doing now, with my medieval text-based email client) with every
expectation that it will look the same to you as it does to me, even if
it includes tables, source code, or anything else -- character-set issues
aside. But that's another discussion (we've had it before).

I hope I can tackle some aspects of this problem myself soon, if only to
see the consequences on a low-tech, low-power platform of executing the
per-character database lookups, decompositions, and normalizations in a
realtime telecommunications setting. I don't suppose anyone else already
has some experience with this?
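
For anyone who wants to see what those per-character operations look
like, here is a minimal sketch using Python's standard unicodedata
module. The helper names describe and normalize_incoming are invented
for illustration, and nothing here says anything about the cost on a
small machine; it only shows the kind of work involved.

```python
import unicodedata

def describe(ch):
    """Per-character database lookup: name, combining class, decomposition."""
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "combining_class": unicodedata.combining(ch),
        "decomposition": unicodedata.decomposition(ch),
    }

def normalize_incoming(chunk, form="NFC"):
    """Normalize one chunk of received text.

    Caveat: if a chunk ends on a base character whose combining mark
    arrives in the next chunk, normalizing chunk by chunk will not
    compose the two, so a real implementation would need to buffer.
    """
    return unicodedata.normalize(form, chunk)

composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

info = describe(composed)
same = normalize_incoming(decomposed) == composed
```

Every received character goes through at least the lookup, and any
sequence containing marks goes through normalization as well, which is
the per-character overhead I am curious about.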

- Frank


