Re: Bidi paragraph direction in terminal emulators

From: Asmus Freytag via Unicode <unicode_at_unicode.org>
Date: Sat, 9 Feb 2019 11:12:03 -0800
On quick reading this appears to be a strong argument why such emulators will
never be able to be used for certain scripts. Effectively, the model described works
well with any scripts where characters are laid out (or can be laid out) in fixed
width cells that are linearly adjacent.

There are some crude techniques that allow an extension to cover scripts that
require half-width or double-width cells, and perhaps even zero-width.
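
To illustrate the kind of crude technique meant here, the following is a minimal Python sketch of a per-code-point width estimate in the spirit of wcwidth. The function names cell_width and text_width are hypothetical, and the rules (East Asian Wide/Fullwidth takes two cells, nonspacing and enclosing marks take zero) are a simplifying assumption, not what any particular emulator implements.

```python
import unicodedata

def cell_width(ch: str) -> int:
    """Rough number of terminal cells for one code point (illustrative only)."""
    # Nonspacing and enclosing marks are drawn over the previous cell.
    if unicodedata.category(ch) in ("Mn", "Me"):
        return 0
    # East Asian Wide and Fullwidth characters occupy two cells.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    # Everything else is assumed to fit in exactly one cell.
    return 1

def text_width(s: str) -> int:
    """Sum of per-code-point widths; no shaping or font data is consulted."""
    return sum(cell_width(ch) for ch in s)
```

The estimate depends only on the code points, never on a font, which is exactly why it cannot account for ligatures or other font-specific shaping.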

However, scripts whose rendering involves complicated ligatures or other
typographical interactions, often specific to a given font, would simply
be out of scope, because for those scripts the fixed-width model with an
underlying buffer mimicking the display cannot be made to work.

And indeed, given that the limitations of a particular design approach are
accepted up front, it would be surprising if such emulators proved flexible
enough to handle the rather wide variety of writing systems supported by Unicode.

At best, the discussion could yield a few further approximations of correct
rendering that can be retrofitted to the particular design restrictions outlined
below, but that with luck extend the envelope somewhat so that a few more
writing systems can be shoehorned into it.

However, it appears quite hopeless to attempt to cover all of Unicode's scripts
on that premise.

A./




On 2/9/2019 10:25 AM, Egmont Koblinger via Unicode wrote:
On Sat, Feb 9, 2019 at 7:07 PM Eli Zaretskii <eliz@gnu.org> wrote:

You need to use what HarfBuzz tells you _instead_ of wcswidth.  It is
in general wrong to use wcswidth or anything similar when you use a
shaping engine and support complex script shaping.

This approach is not viable at all.

Terminal emulators maintain an internal data structure, a matrix of
character cells. Every operation is performed on this layer: the effect
of every escape sequence is defined here, the cursor position is tracked
here, etc. You can move the cursor to integer coordinates, overwrite the
letter in that cell, and do plenty of other operations (like pushing the
rest of the line to the right by one cell). If you change these
fundamentals, most terminal-based applications will fall apart big time.
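
As a rough illustration of the cell matrix described above, here is a minimal Python sketch. The class name Grid and its methods are hypothetical and do not correspond to any real emulator's code; it models only cursor addressing, overwriting a cell, and pushing the rest of a line right by one cell.

```python
class Grid:
    """Toy model of a terminal's cell matrix (illustrative assumption only)."""

    def __init__(self, rows: int, cols: int) -> None:
        self.rows, self.cols = rows, cols
        # One string per cell; a blank cell holds a space.
        self.cells = [[" "] * cols for _ in range(rows)]
        self.row = 0
        self.col = 0

    def move_to(self, row: int, col: int) -> None:
        # The cursor lives at integer coordinates, clamped to the matrix.
        self.row = max(0, min(self.rows - 1, row))
        self.col = max(0, min(self.cols - 1, col))

    def put(self, ch: str) -> None:
        # Overwrite the cell under the cursor, then advance by one cell
        # (no line wrapping in this sketch).
        self.cells[self.row][self.col] = ch
        if self.col < self.cols - 1:
            self.col += 1

    def insert_blank(self) -> None:
        # Push the rest of the line right by one cell; the last cell falls off.
        line = self.cells[self.row]
        line.insert(self.col, " ")
        line.pop()

    def line(self, row: int) -> str:
        return "".join(self.cells[row])
```

Nothing in this model refers to a font or to glyph metrics; every operation is expressed purely in cell coordinates.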

This behavior has to be absolutely independent from the font. The
application running inside the terminal doesn't and cannot know what
font you use, let alone how HarfBuzz is about to render it. (You can
even have no font at all, such as with the libvterm headless emulator
library, or a detached screen or tmux session; or have multiple fonts
at the same time if a screen or tmux session is attached from multiple
graphical emulators.)

So one part of a terminal emulator's code is responsible for
maintaining this matrix of characters according to the input it
receives. Another part of their code is responsible for presenting
this matrix of characters on the UI, doing the best it can.

If you say that the font should determine the logical width, you need
to start building up something brand new from scratch. You need to
have something that doesn't have concepts like "width in characters".
You need to redefine cursor movement and many other escape sequences.
You need to heavily adjust the behavior of a gazillion pieces of software,
e.g. zip's two-column output, anything that aligns in columns (e.g.
midnight commander, tmux's vertical split etc.), the shell's (or
readline's) command editing and wrapping to multiple lines, ncurses,
and so on, all the way to e.g. fullscreen text editors like Emacs.

And then we're not talking about terminal emulators anymore, as we
know them now, but something new, something pretty different.

Terminal emulators do have strong limitations. Complex text rendering
can only work to the extent we can squeeze it into these limitations.


cheers,
egmont


Received on Sat Feb 09 2019 - 13:12:09 CST
