Re: Need for Level Direction Mark

From: Peter Edberg <pedberg_at_apple.com>
Date: Sun, 18 Sep 2011 20:21:38 -0700

Richard,

On Sep 17, 2011, at 7:24 PM, Richard Wordingham wrote:

> On Fri, 16 Sep 2011 18:59:47 -0700
> Peter Edberg <pedberg_at_apple.com> wrote:
>
> I'll take this argument first.
>
>> At any rate, it seems that if LDM-like behavior is needed, there is
>> no alternative using existing controls. As Kent Karlsson says in the
>> e-mail discussion, "All the workarounds w.r.t. LDM depend on the
>> directionality of neighbouring characters, not directly on the
>> embedding level direction. Therefore I think none of them will work
>> properly in all cases (even though they may give the seemingly
>> correct result in many cases)." Either we decide that this behavior
>> is beyond the scope of the UBA, or we decide on one of the options
>> presented (or come up with another).
>
> I have now demonstrated to my satisfaction that text with LDM can be
> converted to text without LDM that should display the same, under the
> following debatable assumptions:
>
> (A) The remaining neutrals prior to the application of Rule W7 in the
> UBA do not ligate or kern with non-neutrals.
>
> (B) Non-displaying runs embedded within other runs have no effect on
> the display.
>
> I can make the conversion tables available on request.

I would like to see these tables, thanks (you can send them to me off-list).

>> Second, responses to some of the suggestions/comments:
>>
>> 1. Richard Wordingham suggested that for the Arabic date example
>> (dd/MM/yyyy), surrounding the '/' with RLM before and LRM after works
>> as well as using LDM before the '/'. <snip>
>>
>> However, it does not handle the situation in which the date is part
>> of other text, and may be preceded or followed by Arabic letters
>> (with an intervening space); there are layout interactions between
>> the Arabic letters and adjacent Arabic digits, since the digits are
>> not treated as being part of a longer sequence due to the direction
>> marks associated with the '/'. This can be solved by placing an LDM
>> before and after the date, as well as before each '/'. However, using
>> an RLM LRM sequence before and after the date causes the spaces
>> around the date to reorder.
>
> The interaction between Arabic letters and Arabic digits that are part
> of the date occurs in a left-to-right embedding. The non-LDM solution
> in each case is to insert an LRM after the separating space.
>
> Incidentally, if Arabic digits (AN, not EN) are used, the separators
> should be terminated by LRM on one side, not by both RLM and LRM.

I am not sure exactly what you are suggesting here. Do you mean just the following (in memory order):
  LRM AN+ '/' LRM AN+ '/' LRM AN+
? If so, that will not lay out correctly in a right-to-left context.
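
To make the sequence above concrete, here is a minimal sketch in Python that builds it and lists the bidi class of each character in memory order. The sample date (18/09/2011 in Arabic-Indic digits) and the helper name bidi_classes are assumptions chosen for illustration; they do not come from this thread. The sketch only looks up classes with the standard unicodedata module. It does not run the UBA, so it shows what the algorithm receives as input, not the resulting display order.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK, bidi class L

# Hypothetical sample date 18/09/2011 in Arabic-Indic digits (bidi class AN).
DAY = "\u0661\u0668"
MONTH = "\u0660\u0669"
YEAR = "\u0662\u0660\u0661\u0661"

# Memory order: LRM AN+ '/' LRM AN+ '/' LRM AN+
date = LRM + DAY + "/" + LRM + MONTH + "/" + LRM + YEAR


def bidi_classes(text):
    """Return the Unicode bidi class of each code point, in memory order."""
    return [unicodedata.bidirectional(ch) for ch in text]


classes = bidi_classes(date)
print(" ".join(classes))
# prints: L AN AN CS L AN AN CS L AN AN AN AN
```

Each '/' (class CS) is followed by a strong L character, and the whole sequence starts with one, which is the structure being asked about.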

> The disadvantage of not having LDM is that the alternative rules are
> complex - I had to refer to my LDM-removal tables to quickly find the
> right steps to take. European digits (EN) are extremely complicated,
> as one has to consider the preceding strong character - L, R, AL or LDM.

Yes, I can see this could get complex. The advantage of LDM is that in many cases it can be used without much awareness of or tailoring for the specific content with which it will be used.

The disadvantage, of course, is the difficulty of integrating with the existing UBA. So it may turn out to be something that is added only if we go to a UBA v2 for other reasons as well.

>> Furthermore, for the example in UAX #9 section 5.6, using RLM and LRM
>> around the '-' causes reordering of the adjacent spaces, while using
>> LDM before each '-' solves the layout problem.
>
> Of course, the problem of spaces is cured if one uses <RLM, SP,
> HYPHEN-MINUS, SP, LRM> as the bounding delimiter.
>
> Richard.

- Peter E
Received on Sun Sep 18 2011 - 22:24:50 CDT
