Re: BidiMirrored property and ancient scripts

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Mon, 27 Jul 2015 15:32:01 +0100

On Sun, 26 Jul 2015 18:08:00 +0300
Eli Zaretskii <eliz_at_gnu.org> wrote:

> > Date: Sat, 25 Jul 2015 22:15:40 +0100
> > From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
> >
> > > > Mirroring is changing a glyph to one suitable for reading in the
> > > > other direction.
> > >
> > > Sorry, I disagree.
> > >
> > > > Note the following extract from BidiMirroring.txt in the
> > > > Unicode Character Database:
> > > >
> > > > <quote>
> > > > # The following characters have no appropriate mirroring character.
> > > > # For these characters it is up to the rendering system
> > > > # to provide mirrored glyphs.
> > >
> > > How's that a contradiction to what I said?
> >
> > U+2140 DOUBLE-STRUCK N-ARY SUMMATION gets mirrored, but its glyph is
> > not replaced by any other character's glyph. Or are you claiming
> > that left-to-right U+2140 and right-to-left U+2140 are two different
> > characters?
>
> I'm saying that "providing a mirrored glyph" entails coming up with a
> character whose glyph can play that role, AFAIU.

I'll take that as 'No' - the left-to-right and right-to-left forms are
the same character. (Unicode has no consistency in this matter.)

> If you are saying that the "rendering system" here is the shaping
> engine using the rtlm OTF feature, then you are in fact saying that
> the mirroring of these characters cannot be implemented with most
> fonts in wide use today, and with most shaping engines. That would be
> a very strange claim, IMO, tantamount to saying that those characters
> cannot, or don't need to, be mirrored at all in most use cases.

OpenType can handle it - feature rtlm effectively provides a
supplementary RTL cmap, and ltrm an LTR cmap. It's conceivable that
DirectWrite and Uniscribe don't support it, but that's unlikely.

It looks as though the HarfBuzz implementation of OpenType also supports
mirroring for right-to-left runs, but I can't find the code subsequent
to tagging characters that weren't reversed using the
Bidi_Mirroring_Glyph property. I have a similar lack of progress with
finding the code for fractions, which also tags characters. Fractions
using U+2044 are supported by HarfBuzz, for all that I can't find the
code.

I can't find any evidence of AAT support.

The OpenType scheme for mirroring for right-to-left text is:

1) Apply Unicode 5.1 Bidi_Mirroring_Glyph property where applicable.

2) For other characters, apply the rtlm feature. This is intended to be
applied character by character.

3) Apply the rtla feature to the resulting glyph sequence.

Note that the font-writer is responsible for determining whether a
character is to be mirrored at Step 2. Also note that there is no need
for font support if all the Bidi mirrored characters it supports have
the Bidi_Mirroring_Glyph property.
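
Steps 1 and 2 above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not how any shaping engine is
actually written: BIDI_MIRRORING_GLYPH is a hand-copied excerpt of
BidiMirroring.txt, rtlm_lookup is a hypothetical stand-in for a font's
rtlm feature, and the glyph name "integral.rtlm" is invented. Step 3
(rtla) works on the glyph sequence and is omitted.

```python
# Hypothetical excerpt of the Bidi_Mirroring_Glyph data (BidiMirroring.txt).
# A real implementation would load the full table.
BIDI_MIRRORING_GLYPH = {
    "(": ")", ")": "(",
    "<": ">", ">": "<",
    "[": "]", "]": "[",
}


def mirror_rtl_run(text, rtlm_lookup):
    """Sketch of Steps 1 and 2 for the characters of a right-to-left run.

    rtlm_lookup is an assumed stand-in for the font's rtlm feature: a dict
    from character to the name of the glyph the font substitutes.
    Returns a list of (result, source) pairs, one per input character.
    """
    out = []
    for ch in text:
        if ch in BIDI_MIRRORING_GLYPH:
            # Step 1: another character's glyph serves as the mirror.
            out.append((BIDI_MIRRORING_GLYPH[ch], "ucd"))
        elif ch in rtlm_lookup:
            # Step 2: the font decides, character by character.
            out.append((rtlm_lookup[ch], "rtlm"))
        else:
            out.append((ch, "unchanged"))
    return out


if __name__ == "__main__":
    font_rtlm = {"\u222B": "integral.rtlm"}  # invented glyph name
    for pair in mirror_rtl_run("(\u222Ba", font_rtlm):
        print(pair)
```

In this sketch a font that lists nothing in rtlm_lookup leaves U+222B
unchanged, which is the 'font support lacking' case.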

There is similar logic for mirroring for left-to-right text, except
that there is no Bidi_Mirroring_Glyph support from Unicode tables. The
decision to mirror is entirely up to the font.

Now, you may be right about font support being lacking, just as it is
often lacking for U+2044 FRACTION SLASH.

If you still don't believe me, please explain why U+222B INTEGRAL has
Bidi_Mirrored=Yes but Bidi_Mirroring_Glyph=<none>.

> > > > > Thus, your reasons make no sense to me, because a character's
> > > > > shape, any character's shape, be it L, R, AL, or anything
> > > > > else, is immutable.
> > > >
> > > > So go back and reread.
> > >
> > > Did that; still no sense.
> >
> > Because you still seem not to understand the concept of mirroring.
>
> I think you will fare much better, and actually stand a chance of
> convincing them you are right, if you assume your opponents do
> understand the issues, and just happen to disagree about their
> interpretation, or misinterpret what you write.

You won't understand my reasoning unless you accept that Bidi mirroring
can change a glyph's shape rather than substitute the glyph of another
character. If you don't accept that, my argument will make no sense,
because you don't accept the premisses.

> > It isn't just for characters that have a Bidi_Mirroring_Glyph
> > property value other than <none>.
>
> Only "in specialized contexts", like "historic scripts and associated
> punctuation, private-use characters, and characters in mathematical
> expressions" (I believe the latter is only happening in Arabic
> context, if it ever does). IOW, in extremely rare and marginal use
> cases. And all that is only in HL6, which is really a fire escape
> meant for applications whose scope is beyond simple text.

L4 makes 'mirroring' mandatory. Note that mirroring need not be exact
graphical mirroring. My interpretation works for both Arabic and
Hebrew. UBA Rule L4 calls for some mathematical symbols to take the
form appropriate for a right-to-left context. (HL6 allows this set
to be extended.) However, from what you say, this form depends on the
language. For example, the basic integral sign flips for Arabic maths
but, I gather, not for Hebrew maths. OpenType can make
the mirrored shape dependent on the language of the text.

> That's a
> far cry from boustrophedon, which was the trigger for most of this
> exchange. In all other cases:
>
> L4. A character is depicted by a mirrored glyph if and only if (a)
> the resolved directionality of that character is R, and (b) the
> Bidi_Mirrored property value of that character is Yes.
>
> That's normative and unequivocal.

And therefore applies to U+222B INTEGRAL. Formally, HL6 is
irrelevant for this character. Now, you might wish for HL6 to be
modified to allow it not to be mirrored, but I think we can stretch the
definition of mirroring to handle it.
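
As a rough check of how L4 applies here, the rule can be written as a
predicate. This is a minimal sketch: it assumes the resolved embedding
level is already known (odd levels being right-to-left), and it takes
Bidi_Mirrored from Python's unicodedata module, whose data depends on
the Unicode version shipped with the interpreter. The function name is
my own.

```python
import unicodedata


def depicted_by_mirrored_glyph(ch, resolved_level):
    """UBA rule L4 as a predicate.

    True if and only if the resolved directionality is R (odd level)
    and the character has Bidi_Mirrored=Yes.
    """
    is_rtl = resolved_level % 2 == 1
    return is_rtl and bool(unicodedata.mirrored(ch))


if __name__ == "__main__":
    for ch in ("(", "\u222B", "\u2140", "a"):
        print(f"U+{ord(ch):04X}", depicted_by_mirrored_glyph(ch, 1))
```

The predicate says nothing about which glyph is used; that is where
Bidi_Mirroring_Glyph or the font comes in.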

UBA Section 7 "Mirroring" says:

"Implementing rule L4 calls for mirrored glyphs. These glyphs may not be
exact graphical mirror images. For example, clearly an italic
parenthesis is not an exact mirror image of another— “(” is not the
mirror image of “)”. Instead, mirror glyphs are those acceptable as
mirrors within the normal parameters of the font in which they are
represented."

This opens up the possibility of the degree of mirroring depending on
the language being supported.

> > > > > > However, one needs the UBA to sort out the rendering of the
> > > > > > parentheses in the Hebrew text.
> > > >
> > > > > Not really, you can short-cut it, the same as in strictly
> > > > > left-to-right text.
> > > >
> > > > It's the UBA that mandates that the opening and closing
> > > > parentheses be rendered like right and left parentheses
> > > > respectively rather than like left and right parentheses.
> > >
> > > Mirroring comes after layout in the UBA, as you pointed out, and
> > > the short-cuts I mentioned are about layout, not about mirroring.
> >
> > So irrelevant.
>
> No, not irrelevant. You can sort out rendering of parentheses in such
> text without applying the BPA, just by considering the parentheses as
> neutrals. That's one shortcut I alluded to.

> > I take it we now agree that the right shape for the parentheses for
> > the unidirectional right-to-left example is derived by the UBA.

> The mirroring is dictated by the UBA, yes.

Which was my point - the UBA applies to unidirectional text.

> But that just delineates
> the difference between boustrophedon and bidirectional text, the
> latter being subject to the UBA, while the former isn't.

I didn't say boustrophedon text was subject to the UBA. I said a
boustrophedon renderer may modify the text to be rendered so that the
UBA will lay out the text properly. This modification is heavily
dependent on line length. Ideally one would lay it out line-by-line.

> > > > > > Indeed, one may rely on the bidi algorithm to declare the
> > > > > > Latin example unidirectional.
> > > > >
> > > > > One might, but to what purpose and goal?
> > > >
> > > > A right-to-left paragraph consisting of the two characters "(a"
> > > > would be bidirectional and have a parenthesis on the right; a
> > > > left-to-right paragraph with the same content would have a
> > > > parenthesis on the left.
> >
> > If there is no higher-level protocol in effect, the 'first strong
> > character' rule (Rules P2 and P3 of the UBA) declares that the
> > paragraph will be a left-to-right paragraph and will look
> > like "(a". Had it been declared a right-to-left paragraph by a
> > higher-level protocol, it would look like "a)". Thus the UBA has a
> > rôle even for unidirectional left-to-right text.

> Once the paragraph direction is overridden by a higher-level protocol,
> the text is no longer unidirectional. Such overriding is equivalent
> to enclosing the paragraph in RLE..PDF pair, which makes the text
> bidirectional by definition.

And if it isn't overridden, it is the UBA which makes it
unidirectional. The UBA specifies the appearance of an opening
parenthesis.
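
The 'first strong character' rule (P2 and P3) referred to above can be
sketched as below. This is a simplification and the function name is my
own: it ignores the requirement in P2 to skip characters between an
isolate initiator and its matching PDI, and the fallback level when no
strong character is found is passed in as an assumed parameter.

```python
import unicodedata


def paragraph_level(text, default=0):
    """Simplified P2/P3: level from the first strong character.

    Returns 0 (left-to-right) for a first strong L, 1 (right-to-left)
    for a first strong R or AL, and `default` if there is none.
    Isolates are not handled in this sketch.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return 0
        if bidi_class in ("R", "AL"):
            return 1
    return default


if __name__ == "__main__":
    print(paragraph_level("(a"))       # parenthesis is neutral; 'a' is L
    print(paragraph_level("(\u05D0"))  # U+05D0 HEBREW LETTER ALEF is R
```

For "(a" the parenthesis is neutral and is passed over, so it is the
'a' that sets the paragraph direction.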

Richard.
Received on Mon Jul 27 2015 - 09:33:42 CDT
