L2/08-418
Date: Thu, 06 Nov 2008 10:23:12 -0800
From: Asmus Freytag
Subject: More bidi issues - mirroring

Recent discussion on the Unicode list has uncovered shortcomings in the specification of mirroring in UAX#9.

It's desirable to allow character mirroring for ancient scripts when rendered in the opposite direction using overrides. That's how these scripts were written. The current spec only allows this for scripts defined as LTR, but there are some for which RTL as a default directionality may be desired. In that case, the language in UAX#9 would need to be adjusted. Proposed language is found in the remainder of this document (in quotes).

This change affects permissible protocols, in a context that doesn't impact "standard" bidi scenarios. Ordinarily, no impact on existing implementations is to be expected.

Also, the explanation of mirrored italic characters could be improved. Proposed language is found at the end of this document (in quotes). This is editorial and intended to clarify a point on which there's apparently some confusion.

See attached messages.

A./

-------------
Message 1:

On 11/4/2008 1:14 AM, Michael Everson wrote:
> On 4 Nov 2008, at 08:09, Kent Karlsson wrote:
>
>>>> I know. That is why there is a loophole, but only a loophole,
>>>> in the bidi algorithm. It covers (badly) Old Italic, which is
>>>> encoded as LTR, when overridden as RTL to (sometimes, only
>>>> sometimes, using some as yet unheard of higher-level-protocol
>>>> mechanism; I can imagine them, just haven't seen any) produce
>>>> mirrored glyphs for the Old Italic letters.
>>>
>>> "The font" is supposed to do that. We've always been told this.
>>
>> But if the bidi algorithm says not to mirror, the font [handling
>> system] should not start to mirror. The mirroring is governed by the
>> bidi algorithm. If the font mirrors glyphs anyway, it's simply a
>> flawed font.
>
> You say this with a lot of conviction. I don't think you're right though.

Just read the standard, in this case UAX#9.

The Bidirectional Conformance section says, removing language not needed for this discussion:

    UAX9-C1. In the absence of a permissible higher-level protocol, a process that renders text shall display all visible representations of characters (excluding format characters) in the order described by Section 3, Basic Display Algorithm, of this annex. In particular, this includes ... L4

    UAX9-C2. The only permissible higher-level protocols are those listed in Section 4.3, Higher-Level Protocols. They are ... and HL6

and L4 is the step that defines mirroring.

    L4. A character is depicted by a mirrored glyph if and only if (a) the resolved directionality of that character is R, and (b) the Bidi_Mirrored property value of that character is true.

    * This rule can be overridden in certain cases; see HL6

and, finally, HL6 says:

    HL6. Additional mirroring.

    * Characters with a resolved directionality of R that do not have the Bidi_Mirrored property can also be depicted by a mirrored glyph in specialized contexts. Such contexts include, but are not limited to, historic scripts and associated punctuation, private-use characters, and characters in mathematical expressions. (See Section 6, Mirroring.)

As you can see, the case where an RTL script is overridden to L is *not* covered by this language. A conformant process MUST NOT mirror characters in this case.

What the UAX should have been saying is:

    "Characters with a resolved directionality of R, or characters defined in the standard with bidirectional class of R and resolved directionality of L, can also...."

But, unless it's fixed, that UAX is *not* saying that and Kent's correct.

A./

---------------
Message 2:

On 11/3/2008 11:54 PM, Kent Karlsson wrote:
> While Unicode does not make this explicit, there is an expectation that
> "truly" paired symmetric-ish punctuation (parentheses, brackets, similar,
> but NOT quote marks) need to be mirrored by using the opposite character
> in the pair.
> Otherwise italics will look funny.
>
>

That's not what the standard says. The text in section 6 Mirroring of UAX#9 states:

    Implementing rule L4 calls for mirrored glyphs. These glyphs may not be exact graphical mirror images. For example, clearly an italic parenthesis is not an exact mirror image of another - "(" is not the mirror image of ")". Instead, mirror glyphs are those acceptable as mirrors within the normal parameters of the font in which they are represented.

The formulation is a bit unfortunate. Instead of stating:

    Instead, mirror glyphs are those acceptable as mirrors within the normal parameters of the font in which they are represented.

what should have been stated is something like this:

    "Instead, mirror glyphs are the proper shape of the character when rendered for the opposite text direction, given the design of the font in which they are represented."

A./
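
For readers who find the conditions easier to follow as code, the L4 test and the HL6 permission discussed in Message 1 can be sketched roughly as below. This is only an illustration under stated assumptions, not text proposed for UAX#9: the function names `l4_requires_mirror` and `hl6_permits_mirror` and the `proposed` flag are made up for this sketch, the resolved direction is assumed to be handed in by a bidi implementation, and the "specialized context" that HL6 also requires is not modelled. Python's standard `unicodedata` module supplies the Bidi_Mirrored and bidirectional class values.

```python
import unicodedata


def l4_requires_mirror(ch: str, resolved_dir: str) -> bool:
    """Rule L4 as quoted: mirror if and only if the resolved direction
    is R and the character has Bidi_Mirrored=Yes."""
    return resolved_dir == "R" and bool(unicodedata.mirrored(ch))


def hl6_permits_mirror(ch: str, resolved_dir: str, proposed: bool = False) -> bool:
    """HL6 as quoted, plus (if proposed=True) the wording suggested in
    Message 1. Hypothetical helper: it only checks the directionality
    conditions, not whether the context is a 'specialized' one."""
    if unicodedata.mirrored(ch):
        # Bidi_Mirrored characters are already handled by L4.
        return False
    if resolved_dir == "R":
        # Current HL6 wording: resolved R, no Bidi_Mirrored property.
        return True
    # Proposed addition: class R character overridden to resolved L.
    return (
        proposed
        and resolved_dir == "L"
        and unicodedata.bidirectional(ch) == "R"
    )


if __name__ == "__main__":
    old_italic_a = "\U00010300"   # OLD ITALIC LETTER A, bidi class L
    phoenician_alf = "\U00010900"  # PHOENICIAN LETTER ALF, bidi class R

    print(l4_requires_mirror("(", "R"))                            # True
    print(hl6_permits_mirror(old_italic_a, "R"))                   # True
    print(hl6_permits_mirror(phoenician_alf, "L"))                 # False
    print(hl6_permits_mirror(phoenician_alf, "L", proposed=True))  # True
```

The third and fourth calls show the gap described above: a character of bidirectional class R that is overridden to L may not be mirrored under the current HL6 wording, and would be permitted only under the proposed wording.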