Re: A basic question on encoding Latin characters

From: Geoffrey Waigh (
Date: Tue Sep 28 1999 - 21:19:03 EDT

Robert Brady wrote:
> Saying "abandon terminals" is not really an adequate solution either, and
> certainly not to UNIX users. Using decomposed characters in the input
> stream breaks things that were possible before. Sure, our metaphor
> couldn't cope with distinguishing "ch" from "c", but one thing it could do
> was distinguish "a-with-ring-above" with "a". We can't do that with things
> normalised to postfix combining characters.

Ah but they are not saying abandon terminals. They are saying you will
probably have to abandon systems designed around the vt100 model or do
a whole bunch of painful work. Various terminals in history have made
fundamental design decisions to simplify their hardware/firmware by
assuming that only a particular character encoding would ever appear in
that system. Some of those character encodings made hideous assumptions
that a particular user community could live with. The problem is that the
world should not revolve around making life easy for terminal emulator
implementors, but around actually providing good data communications.

Extending the hacks of the past as far as they can go in the hopes that
the places they won't reach are not important is a poor design philosophy.
For years Unicode has been making it clear that text processing on a
byte-by-byte level is complex and repetitive enough to belong in an abstract
library. Programs that want to handle text data *really* should be revised
to work with strings of text as the logical unit, and if a program has
inherent design assumptions that preclude this, the program should be
redesigned. Yes, it is a pile of work codewise and designwise. On the
other hand your software will then be fairly well insulated from the
next bizarre writing system requirements unearthed by archeo-linguists.
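To illustrate the string-versus-bytes point (a sketch in Python, chosen here only as one example of a language with such a library built in): the "a-with-ring-above" case from the quoted message can arrive either precomposed or as a base letter plus a combining mark. A byte-oriented comparison sees two different sequences, while a normalization library treats them as the same logical text.

```python
import unicodedata

# "å" can arrive precomposed (U+00E5) or decomposed as
# "a" followed by COMBINING RING ABOVE (U+0061 U+030A).
precomposed = "\u00e5"
decomposed = "a\u030a"

# A byte-per-byte (or codepoint-per-codepoint) comparison
# sees two different strings...
assert precomposed != decomposed

# ...but normalizing both to a canonical form (NFC here)
# recovers the fact that they are the same logical text.
assert unicodedata.normalize("NFC", decomposed) == precomposed

# The reverse direction works too: NFD splits the
# precomposed character back into base + combining mark.
assert unicodedata.normalize("NFD", precomposed) == decomposed
```

A program comparing, searching, or sorting raw byte streams gets all of this wrong for free; one that routes text through such a library is insulated from the encoding-level detail.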


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:53 EDT