Re: bidi support for xterm

From: Edward Cherlin
Date: Sun Aug 15 1999 - 03:32:34 EDT

At 03:48 -0700 8/14/1999, Markus Kuhn wrote:

>However, mere implementations of the Unicode bidi algorithm are far from
>what we need to really understand how to handle bidi text in xterm or
>other VT100/ISO 6429 emulators.

The basic Bidi algorithm in the monospaced context takes a sequence of
Unicode characters as input and gives back something like a rectangular
array (matching the size of the display) of glyphs, or an equivalent list
of lines in visual order. This is easy to render.
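
As a rough illustration of that input/output shape, here is a minimal Python
sketch. It is not the full Unicode bidi algorithm: it assumes only strong
L/R/AL characters matter, treats everything else (digits included) as
neutral, ignores explicit embedding codes, wide and combining characters,
and wraps at a fixed column rather than at word boundaries. The names
char_levels, visual_order, and render are made up for this example.

```python
import unicodedata

def char_levels(text, base_level=0):
    """Assign a simplified embedding level to each character."""
    base_dir = "R" if base_level % 2 else "L"
    # Classify each character as "L", "R", or None (treated as neutral).
    dirs = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        dirs.append("R" if cls in ("R", "AL") else "L" if cls == "L" else None)
    # A run of neutrals takes the direction of its strong neighbours when
    # they agree, otherwise the paragraph (base) direction.
    i = 0
    while i < len(dirs):
        if dirs[i] is None:
            j = i
            while j < len(dirs) and dirs[j] is None:
                j += 1
            before = dirs[i - 1] if i > 0 else base_dir
            after = dirs[j] if j < len(dirs) else base_dir
            dirs[i:j] = [before if before == after else base_dir] * (j - i)
            i = j
        else:
            i += 1
    # Implicit levels: text running against the paragraph goes one level up.
    if base_level % 2 == 0:
        return [0 if d == "L" else 1 for d in dirs]
    return [1 if d == "R" else 2 for d in dirs]

def visual_order(levels):
    """Return logical indices in left-to-right display order."""
    order = list(range(len(levels)))
    if not levels:
        return order
    highest = max(levels)
    lowest_odd = min((l for l in levels if l % 2), default=highest + 1)
    # From the highest level down to the lowest odd level, reverse every
    # contiguous stretch of characters at that level or higher.
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    return order

def render(text, width, base_level=0):
    """Break text into lines in logical order, then reorder each line."""
    levels = char_levels(text, base_level)
    rows = []
    for start in range(0, len(text), width):
        order = visual_order(levels[start:start + width])
        glyphs = "".join(text[start + k] for k in order)
        pad = " " * (width - len(glyphs))
        # Right-align short lines in a right-to-left paragraph.
        rows.append(pad + glyphs if base_level % 2 else glyphs + pad)
    return rows
```

Each returned string is one row of the cell matrix, left to right, so
render("abc \u05d0\u05d1\u05d2 def", 11) keeps the Latin words in place and
reverses only the three Hebrew letters between them.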

The Bidi functions we look for in an editor should do differential display
update, so that only the changed character positions are affected on the
screen. Insert, delete, and replace are fairly easy within a directional
run, as long as they don't spill over to the next line. It takes a bit more
work when the cursor crosses a direction boundary, or when a word is pushed
to the next line or pulled back up as text lengthens and shortens. In the
general case, where multiple boundaries are crossed in a single command, it
may be easier to render the paragraph again from scratch.
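
A minimal sketch of the differential update, in Python, assuming the old and
new screen contents are already available as equal-sized lists of rows in
visual order. The function names are invented for this example; the only
terminal control used is the standard cursor-position sequence
ESC [ row ; col H, which counts rows and columns from 1.

```python
def diff_cells(old_rows, new_rows):
    """Return (row, col, glyph) for every cell whose glyph changed.

    Assumes both grids have the same number of rows and the same width.
    """
    changes = []
    for r, (old, new) in enumerate(zip(old_rows, new_rows)):
        for c, (a, b) in enumerate(zip(old, new)):
            if a != b:
                changes.append((r, c, b))
    return changes

def to_escape_sequences(changes):
    """Turn cell changes into cursor-position + glyph output."""
    return "".join(f"\x1b[{r + 1};{c + 1}H{g}" for r, c, g in changes)

# Appending one Hebrew letter at the logical end of a right-to-left run
# shifts the whole run on screen, so four cells change, not one.
old = ["abc \u05d2\u05d1\u05d0   "]
new = ["abc \u05d3\u05d2\u05d1\u05d0  "]
changed_columns = [c for _, c, _ in diff_cells(old, new)]
```

A real implementation would batch adjacent changed cells into one cursor
move, but the cell-by-cell form shows which positions are affected.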

>Xterm, like any VT100 emulator, is NOT
>just a receiver of a stream of Unicode plaintext. It is a rendering
>engine that places glyphs onto a character cell matrix, and the received
>stream of Unicode characters is mixed with a huge number of different
>control sequences for positioning the cursor, scrolling parts of the
>screen, deleting parts of the screen, etc., whose semantics in the
>context of the Unicode bidi algorithm are extremely unclear (at least to
>me).

Me, too, since they have no definitions other than their behavior on
screen. Are editor functions that closely identified with terminal
controls? Then we must choose whether to keep that mapping, or whether to
implement editor functions that deal with Bidi as users of Bidi expect. Or
implement a set of each, and let users choose.

>We have to worry about full-screen editors such as vi or mined
>which interact with xterm in a very intimate way in order to provide
>the user with intuitive editing functionality. If I tell xterm to
>position the cursor into some Hebrew text and then send the
>delete-end-of-line ESC sequence, is xterm supposed to delete to the left
>or to the right?

Sorry, this question turns out not to fit the context.

Using the obvious codes for LTR, RTL, and Insertion point,
delete-end-of-line removes the characters in the marked positions from the
following line, shown in visual order.

            4321 56789

This is simply the forward direction in the text--leftward to the end of
the leftward run, then rightward in the enclosing rightward run. You must
learn to think of forward and backward, not leftward and rightward. Then
convincing the software to think that way shouldn't be too hard. :-)
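
To make forward versus leftward concrete, here is a small Python sketch. It
assumes a left-to-right paragraph made of non-nested directional runs, given
as (direction, length) pairs in logical order, and it assumes, purely for
illustration, an insertion point after the second character of the RTL run.
The function names are hypothetical.

```python
def visual_columns(runs):
    """Map each logical index to its visual column.

    `runs` lists (direction, length) pairs in logical order for an LTR
    paragraph; an "rtl" run is displayed reversed within its own span.
    """
    cols = []
    start = 0
    for direction, length in runs:
        if direction == "rtl":
            cols.extend(range(start + length - 1, start - 1, -1))
        else:
            cols.extend(range(start, start + length))
        start += length
    return cols

def delete_to_end_of_line(runs, cursor):
    """Return the visual columns cleared by deleting forward from `cursor`."""
    return sorted(visual_columns(runs)[cursor:])

# "4321 56789": a 4-character RTL run, then a 6-character LTR run.
runs = [("rtl", 4), ("ltr", 6)]
cleared = delete_to_end_of_line(runs, 2)
```

With those assumptions the cleared columns are 0, 1, and 4 through 9: two
cells to the left of the insertion point and everything to the right of the
RTL run, while columns 2 and 3 survive. A single forward deletion is not a
single contiguous stretch of the screen.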

>What should the backspace control code do on the screen
>when it passes through mixed Hebrew/Latin text?

Well, this is where the convincing comes in. Some part of the software has
to interpret the Unicode semantics, and translate the results of editor
commands to sequences of terminal commands that create the right display.
Unless you want editor commands that act like terminal commands on the
display, and make the software figure out what sequence of Unicode
characters would produce that arrangement of glyphs.

>I think for xterm the higher priority projects should be biwidth fonts
>(for CJK) and combining characters (for Thai, phonetic alphabet, etc.),
>which seem to be of manageable complexity. I have no idea what a
>practical convention for the interaction of full-screen editors with
>xterm would look like, if xterm tried somehow to implement the Unicode
>bidi algorithm, and I challenge anyone who urgently wants to have the
>bidi algorithm in xterm to write up a detailed proposal that explains
>how this should work precisely.

It isn't hard to state the principles. I can't give you a detailed proposal
since I don't know what editor and terminal command sets you want to
harmonize, nor whether you want to keep terminal function semantics or to
follow the logic of Unicode in extending editor functions. I would be
happy to see a statement of principles and the corresponding set of
functions that need to be defined, and I would assist any effort to design
such software.


>Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
>Email: mkuhn at, WWW: <>

Edward Cherlin
"It isn't what you don't know that hurts you, it's
what you know that ain't so."--Mark Twain, or else
some other prominent 19th century humorist and wit
