Re: Need for Level Direction Mark from Philippe Verdy on 2011-09-19 (Unicode Mail List Archive)

From: Philippe Verdy <verdy_p_at_wanadoo.fr>
Date: Tue, 20 Sep 2011 01:48:45 +0200

2011/9/20 Richard Wordingham <richard.wordingham_at_ntlworld.com>:
> On Mon, 19 Sep 2011 05:44:27 +0200
> Philippe Verdy <verdy_p_at_wanadoo.fr> wrote:
>
>> 2011/9/19 Peter Edberg <pedberg_at_apple.com>:
>
>> > <snip> The whole point
>> > of LDM was to be able to create semi-structured elements such as
>> > the example in UAX #9 section 5.6 *without* knowing in advance
>> > the direction context in which the element would be used.
>>
>> You absolutely don't need to know in advance the direction of context
>> before using LRE..PDF or RLE..PDF. It will work in both directions,
>> ordering and separating the fields in the same order as this context.
>> So yes LRE..PDF and RLE..PDF create a semi-structure, which does fit.
>
> Actually, no. A sequence <LRE embedded_1 PDF N LRE embedded_2 PDF>
> will result in N being resolved as L by rule N1.
> embedded_1 will display to the left of embedded_2 whatever the
> context. You still need something like <LRE embedded_1 PDF RLM N LRE
> embedded_2 PDF> to force embedded_1 to display before embedded_2
> whichever the directionality of the embedding within which these
> occur. (DLM could substitute for RLM and avoid untidily placed
> non-rendering runs.) Alternatively, you could alternate LRE...PDF and
> RLE...PDF.

That's exactly the case where I think that rule N1 is incorrect if the
overall embedding level is RTL.
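
To make the disputed behaviour concrete, here is a rough Python sketch
of how the current rules treat Richard's sequence. It is only an
illustration under heavy assumptions: the function name visual_order
and the (text, class) token format are made up for this example, only
the classes L, R and N are modelled (no weak types, overrides or
overflow handling), and each token is treated as an atomic unit. It
follows X1-X10, N1/N2, I1/I2 and L2 in simplified form.

```python
def direction(level):
    # Even levels are left-to-right, odd levels are right-to-left.
    return "L" if level % 2 == 0 else "R"


def visual_order(tokens, base_level):
    """Return token texts in left-to-right display order (simplified UBA)."""
    # X1-X9 (simplified): assign embedding levels, then drop LRE/RLE/PDF.
    stack, level, items = [], base_level, []
    for text, cls in tokens:
        if cls == "LRE":
            stack.append(level)
            level += 2 if level % 2 == 0 else 1   # next even level
        elif cls == "RLE":
            stack.append(level)
            level += 1 if level % 2 == 0 else 2   # next odd level
        elif cls == "PDF":
            if stack:
                level = stack.pop()
        else:
            items.append([text, cls, level])

    n = len(items)
    i = 0
    while i < n:
        # X10: find one level run; sor/eor use the higher adjacent level.
        j = i
        while j < n and items[j][2] == items[i][2]:
            j += 1
        run_level = items[i][2]
        prev_level = items[i - 1][2] if i > 0 else base_level
        next_level = items[j][2] if j < n else base_level
        sor = direction(max(prev_level, run_level))
        eor = direction(max(run_level, next_level))
        # N1/N2: resolve each sequence of neutrals inside this run.
        k = i
        while k < j:
            if items[k][1] != "N":
                k += 1
                continue
            m = k
            while m < j and items[m][1] == "N":
                m += 1
            before = items[k - 1][1] if k > i else sor
            after = items[m][1] if m < j else eor
            resolved = before if before == after else direction(run_level)
            for p in range(k, m):
                items[p][1] = resolved
            k = m
        i = j

    # I1/I2: a type that disagrees with its level's direction goes up one.
    for item in items:
        if direction(item[2]) != item[1]:
            item[2] += 1

    # L2: reverse contiguous sequences at each level, highest level first.
    levels = [item[2] for item in items]
    order = list(range(n))
    for lvl in range(max(levels, default=0), 0, -1):
        i = 0
        while i < n:
            if levels[order[i]] >= lvl:
                j = i
                while j < n and levels[order[j]] >= lvl:
                    j += 1
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    # Invisible marks carry an empty text and are left out of the result.
    return [items[k][0] for k in order if items[k][0]]


LRE, PDF, RLM = ("", "LRE"), ("", "PDF"), ("", "R")
plain = [LRE, ("A", "L"), PDF, (",", "N"), LRE, ("B", "L"), PDF]
with_rlm = [LRE, ("A", "L"), PDF, RLM, (",", "N"), LRE, ("B", "L"), PDF]

for name, seq in (("plain", plain), ("with_rlm", with_rlm)):
    for base in (0, 1):
        print(name, "base level", base, visual_order(seq, base))
```

In this model the plain sequence puts A to the left of B for both base
levels, because sor and eor of the separator's run are both L. With an
RLM after the first PDF, A ends up on the right when the base level
is 1.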

And note that we were speaking about CS separators, not N separators.

I'm not advocating changes or additions to the Bidi classes, but a
correction to that rule (specifically for the behavior of PDF and what
happens after it, which should not depend on the character before PDF,
but on the character before LRE or RLE).

This also has practical applications (for example, look at the
current Wikimedia bug when it wants to display lists of category
names and insert a separator between them: there's no reliable
solution for now to make it work using spans with CSS bidi-control
properties when the category names can alternate between Arabic and
Latin). And there are also undesirable consequences on mirroring.

> You also need extra marks to avoid the structure sucking in adjacent
> elements - you need either
>
> <RLM LRE embedded_1 PDF RLM N LRE embedded_2 PDF RLM>
>
> or
>
> <DLM LRE embedded_1 PDF DLM N LRE embedded_2 PDF DLM>

Which is really overkill. LRE and RLE are supposed to completely embed
and mask the effective direction of their content, so that the initial
weak context is fully restored by PDF and applies to the content after
it (whatever its Bidi class).

I think it's better to correct the UBA to get the expected full
restoration of context by PDF, rather than adding a new LDM (and an
associated new class) which will still require a change in the UBA to
be effective, and that will also break the Unicode stability rule.

This is a bug in the resolution step of the UBA. Nothing should
change when the content in each LRE..PDF or RLE..PDF is being
internally resolved (partly, because there will remain weak directions
in that context, for which you'll need to infer the actual direction
from the text before the embedded section, without this forcing the
rest of the content after the embedded section to also depend on how
the embedded section was internally resolved).

The problem in fact comes from the UBA removing the Bidi controls too
soon (and removing them is not even necessary, except maybe after the
very last step).
Received on Mon Sep 19 2011 - 18:52:31 CDT
