L2/08-419
Date: Thu, 06 Nov 2008 10:45:24 -0800
From: Asmus Freytag
Subject: Specifying Boustrophedon in the Unicode Context

For UTC and bidi subcommittee information.

This comes from a discussion on the Unicode list on boustrophedon (alternate lines alternating writing direction LTR/RTL) originally discussed in the context of Old Hungarian.

In that discussion it became clear to me that Unicode's "benign neglect" of boustrophedon as a writing mode is entirely too benign. It causes uncertainty on where this fits in an encoding and rendering architecture in which, for example, scripts have pre-assigned default directionality and bidirectional reordering and mirroring is *mandatory* for conformant processes.

This is not written like a proposal, but the last section could be turned into a proposal by anyone interested in making this move forward.

A./

-------- Original Message --------
Subject: Re: Boustrophedon
Date: Wed, 05 Nov 2008 21:54:33 -0800
From: Asmus Freytag

On 11/4/2008 6:07 AM, Doug Ewell wrote:
> Q: Why is this thread like boustrophedon itself?
> A: Because it goes in two different directions.
>
> It would be really neato if we could split the "principles of boustrophedon" discussion off into a separate thread...

Q: Why does Unicode's bidi specification intersect with rendering ancient scripts?

A: The bidi specification has two parts, an implicit part and an explicit part. The implicit part handles whether a script is by default RTL or LTR, and the interaction with *shared* punctuation marks and modern digits. The explicit part is where the author can assign directionality to a run of text.

Because of the needs of modern bidi scripts, these mechanisms are widely supported in tools and on the web. If you want to publish ancient texts, and if you can fit into these existing solutions, then you're done (plus or minus having a font available for yourself and recipients of non-final form documents, such as HTML).

HTML, CSS etc.
take over part of what the explicit part of the bidi specification can do, and replace it with commands in their own syntax. Generally, the results could still be copied out as plain text, preserving the bidi settings as explicit bidi format characters.

Q: Do we need a boustrophedon setting for HTML, CSS etc.?

A: The problem with using existing bidi formatting for boustrophedon is that you lose automatic line wrapping. You have to decide ahead of time where your line ends, and insert format characters to define the directionality of the following line. That may be fine for *exactly* representing ancient originals, but it's a poor approach for showing general texts in a manner that preserves the spirit of boustrophedon, but works in the modern typographic environment.

Honestly, it is unclear at this moment how important that functionality is - but unless it becomes a part of these protocols (and widely supported) it will be difficult to widely disseminate such texts formatted in this manner.

Q: Why is the specification of bidi/mirroring important?

A: The way mirroring is defined in bidi is all-inclusive, with the few exceptions limited to LTR scripts shown as RTL (exactly 1/2 of the possible range of boustrophedon cases). Any implementation of a protocol or rendering system that's bidi conformant would be unable to provide boustrophedon support for RTL scripts without violating that conformance. That makes widespread support even more difficult.

Q: What should be done?

A: People caring about ancient scripts should file a UTC paper requesting

a) broadening of the existing exception to cover RTL scripts shown in LTR direction via overrides
b) alternative: add a specification for boustrophedon (see below)
c) work with W3C to support the necessary modes in HTML, CSS, etc.

Q: What would be a reasonable *Unicode* specification for boustrophedon?

A: First, it would deliberately be very *minimal*.
It would mostly point out where boustrophedon fits into the encoding and rendering architecture defined by Unicode, and how it is to interact with the bidi algorithm, but would not define the detailed behaviors.

It would have these elements

a) boustrophedon is a permissible higher level protocol in terms of bidi
b) when active, details of mirroring can be overridden by the protocol that defines when and how boustrophedon is active
c) when boustrophedon is active, normal bidi behavior applies, but relative to the direction of the current line (what was LTR is now downstream and what was RTL is now upstream). This preserves a definite meaning for the bidi format characters.

In other words, true boustrophedon can be implemented consistently after bidi evaluation and reordering, by flipping every alternate line. The normal line breaking can be performed.

The details of how to designate a run of text as boustrophedon and whether the first line is RTL or LTR, or which runs are exceptionally mirrored, would be outside the scope of the Unicode standard. (And could then be handled cleanly by HTML, CSS etc).

The normal bidi format characters are used to override the bidi behavior *before* boustrophedon is applied. By pushing the question of defining the text section(s) to which boustrophedon applies off to the protocol, no new scoping is introduced on the plain text level.

Q: Do I expect anyone from that community to actually propose a technical specification?

A: I'm not holding my breath.

A./
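Illustration of the "flip every alternate line" idea from the last answer, as a minimal Python sketch. It is not part of any Unicode, HTML or CSS specification. It assumes the lines have already been through normal line breaking and bidi reordering, so each string is in visual order. All names here (`MIRROR_PAIRS`, `flip_line`, `boustrophedon`, `first_line_flipped`, `mirror`) are hypothetical, and the tiny mirroring table only stands in for whatever mirroring rule the higher-level protocol would choose.

```python
# Sketch only: boustrophedon as a step applied AFTER bidi reordering.
# Input lines are assumed to be in visual order already.

# Hypothetical, deliberately tiny mirroring table (not the Unicode data file).
MIRROR_PAIRS = {"(": ")", ")": "(", "[": "]", "]": "[", "<": ">", ">": "<"}


def flip_line(line, mirror=True):
    """Reverse one visual-order line, optionally mirroring paired glyphs."""
    flipped = line[::-1]
    if mirror:
        flipped = "".join(MIRROR_PAIRS.get(ch, ch) for ch in flipped)
    return flipped


def boustrophedon(visual_lines, first_line_flipped=False, mirror=True):
    """Flip every alternate line; the protocol decides which line starts."""
    result = []
    for index, line in enumerate(visual_lines):
        # Odd-numbered lines are flipped by default; the flag swaps parity.
        flip = (index % 2 == 1) != first_line_flipped
        result.append(flip_line(line, mirror) if flip else line)
    return result


if __name__ == "__main__":
    for row in boustrophedon(["abc(", "def)", "ghi"]):
        print(row)
```

Whether flipped lines are also aligned to the opposite margin, and which runs are exempt from mirroring, is left out here; under the approach above those choices would belong to the protocol rather than to plain text.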