Re: PRI #231: Bidi Parenthesis Algorithm

From: Konstantin Ritt <>
Date: Thu, 7 Jun 2012 13:06:04 +0300

Yep, I forgot to mention the difference: some paired quotation
characters may be used alone, e.g. in place of an apostrophe, so
the BPA rules could be relaxed for quotation marks.
Dunno about their mirroring in all languages. I thought
BidiMirroring.txt was supposed to list (language-independent)
characters and their respective mirrored counterparts.
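
A quick way to look at these properties is sketched below. It assumes
Python's standard unicodedata module, which exposes General_Category and
the Bidi_Mirrored flag but not the glyph mapping from BidiMirroring.txt
itself; the helper name describe() is made up for illustration.

```python
import unicodedata

def describe(ch):
    # Returns (code point, General_Category, Bidi_Mirrored flag as 0/1).
    return (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.mirrored(ch))

# Guillemets and single quotation marks, as discussed in this thread.
for ch in "\u00ab\u00bb\u2018\u2019":
    print(*describe(ch))
```

The flag only says whether a character is mirrored at all; finding the
actual mirrored counterpart still needs the data file.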

UAX#24 section 2.2 "Handling Characters with the Common Script Property" states:
> In determining the boundaries of a run of text in a given script, programs must resolve any of the special script property values, such as Common, based on the context of the surrounding characters. A simple heuristic uses the script of the preceding character, which works well in many cases. However, this may not always produce optimal results. For example, in the text “... gamma (γ) is ...”, this heuristic would cause matching parentheses to be in different scripts.
> Generally, paired punctuation, such as brackets or quotation marks, belongs to the enclosing or outer level of the text and should therefore match the script of the enclosing text. In addition, opening and closing elements of a pair resolve to the same script property values, where possible. The use of quotation marks is language dependent; therefore it is not possible to tell from the character code alone whether a particular quotation mark is used as an opening or closing punctuation. For more information, see Section 6.2, General Punctuation, of [Unicode].
> Some characters that are normally used as paired punctuation may also be used singly. An example is U+2019 right single quotation mark, which is also used as apostrophe, in which case it no longer acts as an enclosing punctuation. An example from physics would be <ψ| or |ψ>, where the enclosing punctuation characters may not form consistent pairs.

IIUC, this is the same problem as the one PRI #231 is intended to solve.

For cases like "a«b»", one would expect the UBA and script
itemization to produce similar results.
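
To make the UAX#24 heuristic quoted above concrete, here is a rough
sketch, not a conforming implementation. Assumptions: the PAIRS table is
a tiny hand-picked set (not BidiBrackets.txt), crude_script() guesses the
script from the first word of the character name because the Python
standard library has no Script property, and unmatched openers or closers
simply fall back to the preceding-character heuristic.

```python
import unicodedata

# Hypothetical pair table for illustration only.
PAIRS = {"(": ")", "[": "]", "\u00ab": "\u00bb"}
OPENER_OF = {close: open_ for open_, close in PAIRS.items()}

def crude_script(ch):
    # Assumption: the first word of the character name approximates the script.
    if not ch.isalpha():
        return "Common"
    return unicodedata.name(ch, "Common").split(" ")[0].capitalize()

def resolve_scripts(text):
    scripts = [crude_script(ch) for ch in text]
    stack, pairs = [], []   # open bracket indices, matched (open, close) pairs
    last = None             # script of the most recent resolved character
    for i, ch in enumerate(text):
        if scripts[i] != "Common":
            last = scripts[i]
            continue
        scripts[i] = last   # simple heuristic: inherit from preceding text
        if ch in PAIRS:
            stack.append(i)
        elif ch in OPENER_OF and stack and text[stack[-1]] == OPENER_OF[ch]:
            opener = stack.pop()
            pairs.append((opener, i))
            if scripts[opener] is not None:
                last = scripts[opener]   # back at the enclosing level
    # Common characters with no preceding script take the next one instead.
    following = None
    for i in range(len(text) - 1, -1, -1):
        if scripts[i] is None:
            scripts[i] = following
        else:
            following = scripts[i]
    # A closing element resolves to the same script as its opening element.
    for opener, closer in pairs:
        scripts[closer] = scripts[opener]
    return [s or "Common" for s in scripts]

print(resolve_scripts("gamma (\u03b3) is"))
print(resolve_scripts("a\u00abb\u00bb"))
```

With the pair matching in place, both parentheses in "gamma (γ) is" end
up in the script of the enclosing text rather than being split between
two scripts.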


2012/6/7 Philippe Verdy <>:
> Their pairing and mirroring is not appropriate for all languages using them.
> 2012/6/7 Konstantin Ritt <>:
>> Actually, they have respective entries in BidiMirroring.txt:
>> and are mapped to gc=Pi and gc=Pf.
>> Even without the per-language tailoring, it seems like a good basic
>> approximation, no?
Received on Thu Jun 07 2012 - 05:08:33 CDT
