Re: Need for Level Direction Mark

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Sun, 18 Sep 2011 03:24:43 +0100

On Fri, 16 Sep 2011 18:59:47 -0700
Peter Edberg <pedberg_at_apple.com> wrote:

I'll take this argument first.

> At any rate, it seems that if LDM-like behavior is needed, there is
> no alternative using existing controls. As Kent Karlsson says in the
> e-mail discussion, "All the workarounds w.r.t. LDM depend on the
> directionality of neighbouring characters, not directly on the
> embedding level direction. Therefore I think none of them will work
> properly in all cases (even though they may give the seemingly
> correct result in many cases)." Either we decide that this behavior
> is beyond the scope of the UBA, or we decide on one of the options
> presented (or come up with another).

I have now demonstrated to my satisfaction that text with LDM can be
converted to text without LDM that should display the same, under the
following debatable assumptions:

(A) The remaining neutrals prior to the application of Rule W7 in the
UBA do not ligate or kern with non-neutrals.

(B) Non-displaying runs embedded within other runs have no effect on
the display.

I can make the conversion tables available on request.

> Second, responses to some of the suggestions/comments:
>
> 1. Richard Wordingham suggested that for the Arabic date example
> (dd/MM/yyyy), surrounding the '/' with RLM before and LRM after works
> as well as using LDM before the '/'. <snip>
>
> However, it does not handle the situation in which the date is part
> of other text, and may be preceded or followed by Arabic letters
> (with an intervening space); there are layout interactions between
> the Arabic letters and adjacent Arabic digits, since the digits are
> not treated as being part of a longer sequence due to the direction
> marks associated with the '/'. This can be solved by placing an LDM
> before and after the date, as well as before each '/'. However, using
> an RLM LRM sequence before and after the date causes the spaces
> around the date to reorder.

The interaction between Arabic letters and Arabic digits that are part
of the date occurs in a left-to-right embedding. The non-LDM solution
in each case is to insert an LRM after the separating space.

Incidentally, if Arabic digits (AN, not EN) are used, the separators
should be terminated by LRM on one side, not by both RLM and LRM.

The disadvantage of not having LDM is that the alternative rules are
complex: I had to refer to my LDM-removal tables to find the right
steps quickly. European digits (EN) are especially complicated, as one
has to consider the preceding strong character - L, R, AL or LDM.
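
As a rough illustration of why the digit type and the preceding strong
character matter, the following minimal Python sketch prints the bidi
classes of a date written with European digits and with Arabic-Indic
digits, and looks up the nearest preceding strong class. The helper
names (bidi_classes, preceding_strong) and the sample strings are
hypothetical; the sketch ignores embedding levels and explicit
formatting codes, and it cannot represent LDM, which is only a proposed
character.

```python
import unicodedata

# Strong bidi classes that can influence how following digits are resolved.
STRONG = {"L", "R", "AL"}


def bidi_classes(text):
    """Return the Unicode bidi class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


def preceding_strong(text, index):
    """Return the class of the nearest strong character before index.

    Simplification: scans the raw string only and knows nothing about
    embedding levels. Returns None if no strong character precedes index.
    """
    for ch in reversed(text[:index]):
        cls = unicodedata.bidirectional(ch)
        if cls in STRONG:
            return cls
    return None


# Hypothetical sample dates in dd/MM/yyyy form.
european_date = "18/09/2011"
arabic_indic_date = "\u0661\u0668/\u0660\u0669/\u0662\u0660\u0661\u0661"

print(bidi_classes(european_date))      # digits are EN, '/' is CS
print(bidi_classes(arabic_indic_date))  # digits are AN, '/' is CS

# Arabic letters, a space, then a date in European digits.
sample = "\u0627\u0644 18/09/2011"
print(preceding_strong(sample, 3))      # class seen by the first digit
```

Running the lookup at the first digit of the last sample reports AL,
which is one of the cases that has to be considered separately for EN.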

> Furthermore, for the example in UAX #9 section 5.6, using RLM and LRM
> around the '-' causes reordering of the adjacent spaces, while using
> LDM before each '-' solves the layout problem.

Of course, the problem of spaces is cured if one uses <RLM, SP,
HYPHEN-MINUS, SP, LRM> as the bounding delimiter.
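
The bounding delimiter above can be sketched in Python as follows. The
constant and function names are hypothetical, and the sketch only
builds the string and reports its bidi classes; it does not run the
UBA or show the resulting display order.

```python
import unicodedata

RLM = "\u200f"  # RIGHT-TO-LEFT MARK
LRM = "\u200e"  # LEFT-TO-RIGHT MARK

# <RLM, SP, HYPHEN-MINUS, SP, LRM>: the spaces sit inside the marks.
DELIMITER = RLM + " - " + LRM


def join_with_delimiter(parts):
    """Join the given text parts with the mark-bounded delimiter."""
    return DELIMITER.join(parts)


def delimiter_classes():
    """Return the bidi class of each character in the delimiter."""
    return [unicodedata.bidirectional(ch) for ch in DELIMITER]


print(delimiter_classes())
```

Because both spaces fall between the RLM and the LRM, they are bounded
by the marks rather than left adjacent to the surrounding text.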

Richard.
Received on Sat Sep 17 2011 - 21:30:47 CDT
