L2/08-419

Date: Thu, 06 Nov 2008 10:45:24 -0800
From: Asmus Freytag
Subject: Specifying Boustrophedon in the Unicode Context

For UTC and bidi subcommittee information.

This comes from a discussion on the Unicode list on boustrophedon (successive
lines alternating in writing direction, LTR/RTL), originally raised in the
context of Old Hungarian. In that discussion it became clear to me that
Unicode's "benign neglect" of boustrophedon as a writing mode is entirely
too benign. It causes uncertainty about where this fits in an encoding and
rendering architecture in which, for example, scripts have pre-assigned
default directionality and bidirectional reordering and mirroring are
*mandatory* for conformant processes.

This is not written like a proposal, but the last section could be turned
into a proposal by anyone interested in making this move forward.

A./

-------- Original Message --------
Subject:     Re: Boustrophedon
Date:     Wed, 05 Nov 2008 21:54:33 -0800
From:     Asmus Freytag <asmusf@ix.netcom.com>



On 11/4/2008 6:07 AM, Doug Ewell wrote:
> Q: Why is this thread like boustrophedon itself?
> A: Because it goes in two different directions.
>
> It would be really neato if we could split the "principles of
> boustrophedon" discussion off into a separate thread...

Q: Why does Unicode's bidi specification intersect with rendering
ancient scripts?

A: The bidi specification has two parts, an implicit part and an explicit
part. The implicit part handles whether a script is by default RTL or
LTR, and the interaction with *shared* punctuation marks and modern digits.
The explicit part is where the author can assign directionality to a run of text.

Because of the needs of modern bidi scripts, these mechanisms are widely
supported in tools and on the web. If you want to publish ancient texts,
and if you can fit into these existing solutions, then you're done (plus
or minus having a font available for yourself and recipients of non-final
form documents, such as HTML).

HTML, CSS etc. take over part of what the explicit part of the bidi
specification can do, and replace it with commands in their own syntax.
Generally, the results could still be copied out as plain text, preserving
the bidi settings as explicit bidi format characters.

Q: Do we need a boustrophedon setting for HTML, CSS etc?

A: The problem with using existing bidi formatting for boustrophedon
is that you lose automatic line wrapping. You have to decide ahead of
time where your line ends, and insert format characters to define the
directionality of the following line. That may be fine for *exactly*
representing ancient originals, but it's a poor approach for showing
general texts in a manner that preserves the spirit of boustrophedon,
but works in the modern typographic environment.
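
To make the cost concrete, here is a minimal Python sketch of the "decide
ahead of time" approach. The function name hardcode_boustrophedon and its
parameters are hypothetical, made up for this illustration; the only things
taken from Unicode are the three bidi format characters (LRO, RLO, PDF).
The caller has to supply the lines already broken, which is exactly the
limitation described above.

```python
# Explicit bidi format characters defined by Unicode.
LRO = "\u202D"  # LEFT-TO-RIGHT OVERRIDE
RLO = "\u202E"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def hardcode_boustrophedon(lines, first_rtl=False):
    """Wrap each pre-broken line in an override of alternating direction.

    `lines` must already be split at the desired line ends; nothing here
    can re-wrap the text if the display width changes later.
    (Hypothetical helper, for illustration only.)
    """
    out = []
    for i, line in enumerate(lines):
        # Odd-numbered lines take the opposite direction of the first line.
        rtl = first_rtl ^ (i % 2 == 1)
        out.append((RLO if rtl else LRO) + line + PDF)
    return "\n".join(out)
```

If the text is later shown in a narrower or wider column, the overrides no
longer line up with the actual line ends and the whole text has to be
re-marked by hand.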

Honestly, it is unclear at this moment how important that functionality
is - but unless it becomes a part of these protocols (and widely
supported) it will be difficult to widely disseminate such texts formatted
in this manner.

Q: Why is the specification of bidi/mirroring important?

A: The way mirroring is defined in bidi is all-inclusive, with the
few exceptions limited to LTR scripts shown as RTL (exactly 1/2 of
the possible range of boustrophedon cases). Any implementation of a
protocol or rendering system that's bidi conformant would be unable to
provide boustrophedon support for RTL scripts without violating that
conformance. That makes widespread support even more difficult.

Q: What should be done?

A: People caring about ancient scripts should file a UTC paper requesting
a) broadening of the existing exception to cover RTL scripts shown in
LTR direction via overrides
b) alternative: add a specification for boustrophedon (see below)
c) work with W3C to support the necessary modes in HTML, CSS, etc.


Q: What would be a reasonable *Unicode* specification for boustrophedon?

A: First, it would deliberately be very *minimal*. It would mostly point
out where boustrophedon fits into the encoding and rendering architecture
defined by Unicode, and how it is to interact with the bidi
algorithm, but would not define the detailed behaviors.
It would have these elements:
a) boustrophedon is a permissible higher level protocol in terms of bidi
b) when active, details of mirroring can be overridden by the protocol
that defines when and how boustrophedon is active
c) when boustrophedon is active, normal bidi behavior applies, but relative
to the direction of the current line (what was LTR is now downstream and
what was RTL is now upstream). This preserves a definite meaning for the
bidi format characters.

In other words, true boustrophedon can be implemented consistently after bidi
evaluation and reordering, by flipping every alternate line. The normal line
breaking can be performed. The details of how to designate a run of text as
boustrophedon and whether the first line is RTL or LTR, or which runs are
exceptionally mirrored, would be outside the scope of the Unicode standard.
(And could then be handled cleanly by HTML, CSS etc).
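
The "flip every alternate line" step can be sketched in a few lines of
Python. This is only an illustration under strong assumptions, not a
proposed implementation: boustrophedon_layout and its parameters are
hypothetical names, textwrap.wrap stands in for real line breaking, and the
input is assumed to be already in visual order after bidi reordering (for
plain LTR text that is simply the logical order). Glyph mirroring is left
out entirely, since that would be up to the higher level protocol.

```python
import textwrap


def boustrophedon_layout(text, width, first_line_flipped=False):
    """Break text into lines normally, then flip every alternate line.

    Assumes `text` is already in visual order (bidi resolved beforehand).
    Reversal is done per code point, so combining sequences would need
    cluster-aware handling in anything beyond a sketch.
    (Hypothetical helper, for illustration only.)
    """
    # Ordinary automatic line breaking happens first and is unaffected.
    lines = textwrap.wrap(text, width)
    out = []
    for i, line in enumerate(lines):
        # Alternate lines run against the direction of the first line.
        flipped = first_line_flipped ^ (i % 2 == 1)
        out.append(line[::-1] if flipped else line)
    return out
```

Because the flip happens after line breaking, changing the width simply
produces different lines with the alternation reapplied, with no format
characters to move around.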

The normal bidi format characters are used to override the bidi behavior
*before* boustrophedon is applied.
By pushing the question of defining the text section(s) to which boustrophedon
applies off to the protocol, no new scoping is introduced
on the plain text level.

Q: Do I expect anyone from that community to actually propose a technical specification?

A: I'm not holding my breath.

A./