Re: Public Review Issue 232 Proposed Update UAX #9, Unicode Bidirectional Algorithm (Copy of email sent to the list; also posted by me to unicode feedback/public review issue-- but this has not yet posted there)

From: Philippe Verdy <verdy_p_at_wanadoo.fr>
Date: Fri, 25 Jan 2013 05:21:39 +0100

Letter-like mathematical symbols are those like Product (Greek capital
Pi) and Sum (Greek capital Sigma). Mirroring them by default would have
strange effects, even if they may be mirrored in formulas.
Less-than and greater-than signs are not letter-like and are safe to
mirror; they behave like parentheses.
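
To see how the characters named above are currently classified, here is a
minimal sketch using Python's standard unicodedata module. It only reports
the Bidi class and the Bidi_Mirrored property for each sample; the values
come from whatever Unicode version your Python build bundles, so they may
differ between installations. The names SAMPLES and describe are mine,
chosen for illustration.

```python
import unicodedata

# Characters discussed above: N-ARY PRODUCT, N-ARY SUMMATION,
# the less-than/greater-than signs, and the ASCII parentheses.
SAMPLES = ["\u220F", "\u2211", "<", ">", "(", ")"]


def describe(ch):
    """Return (code point, name, Bidi class, Bidi_Mirrored) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.bidirectional(ch),   # e.g. "ON" for other neutrals
        bool(unicodedata.mirrored(ch)),  # True when Bidi_Mirrored=Yes
    )


for ch in SAMPLES:
    print(describe(ch))
```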

Ornate parentheses should be mirrorable, even if they are used mostly
in RTL texts (but why couldn't they be used to surround some Latin
words in an Arabic text?)

Our problem for now is the "best-fit" pairing (rule HL6 really hurts
and offers absolutely no benefit; it causes havoc and interoperability
problems; it should be admitted ONLY for PUA characters, or for
out-of-band markup syntax in a rich-text format because they are
supposed to be rendered as isolates in code editors). For the
stability of BiDi, this should not happen. Mirrorable pairs should be
definitively stabilized (and for all exceptions, which include
characters that are not Bidi-neutral or that have a strong direction,
new characters should be encoded if needed; these won't need any BiDi
control, as their direction will be fixed and they will never be
mirrored).
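
To make the point about stable pairs concrete, here is a rough sketch of
glyph selection as rule L4 describes it (mirror only when the resolved
level is odd and Bidi_Mirrored is true), plus the kind of symmetry check
one would want on the pair table. MIRROR_PAIRS is a small hand-written
excerpt I made up for illustration, not the real data file; the function
names are also mine. A real implementation would load the full table from
BidiMirroring.txt.

```python
import unicodedata

# Hypothetical hand-written excerpt of mirrored pairs (assumption: a real
# implementation would read the complete list from BidiMirroring.txt).
MIRROR_PAIRS = {"(": ")", ")": "(", "<": ">", ">": "<", "[": "]", "]": "["}


def is_symmetric(pairs):
    """Check that every mapping c -> m also has the reverse mapping m -> c."""
    return all(pairs.get(m) == c for c, m in pairs.items())


def display_char(ch, resolved_level, pairs=MIRROR_PAIRS):
    """Pick the character to display for ch at the given resolved level.

    Mirroring applies only at odd (right-to-left) levels and only to
    characters whose Bidi_Mirrored property is true. When no paired
    character is known, this sketch returns ch unchanged; a renderer
    would have to mirror the glyph itself in that case.
    """
    if resolved_level % 2 == 1 and unicodedata.mirrored(ch):
        return pairs.get(ch, ch)
    return ch
```

With this table, display_char("(", 1) selects ")" while display_char("(", 0)
leaves it alone. If a pair could later change, the same stored text would
display differently, which is the stability problem.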

2013/1/24 CE Whitehead <cewcathar_at_hotmail.com>:
> Hi.
>> From: verdy_p_at_wanadoo.fr
>> Date: Mon, 21 Jan 2013 06:06:17 +0100
>> Subject: Re: Public Review Issue 232 Proposed Update UAX #9, Unicode
>> Bidirectional Algorithm (Copy of email sent to the list; also posted by me
>> to unicode feedback/public review issue-- but this has not yet posted there)
>> To: cewcathar_at_hotmail.com
>> CC: mgrzegor_at_poczta.onet.pl; markdavis_at_google.com; aharon_at_google.com
>
>>
>> 2013/1/20 CE Whitehead <cewcathar_at_hotmail.com>:
>> >
>> > Hi, this is just a copy of the comments I sent to the Unicode Comments
>> > regarding this issue (and also emailed to Rick), in case these somehow
>> > don't make it into the feedback (I think I still have a day or so).
>>
>> In fact my comment sent to the UTC report form was about a current
>> problem of the UBA algorithm: its result should become stabilized by
>> a policy, but the properties it depends on are not. This is the case
>> of general category and of the mirrored pairs. Most probably we should
>> not stabilize the general category, but depend on a more explicit
>> property for the BiDi algorithm itself.
> Hmm. You are ahead of me maybe, because I don't see that much what
> difference it makes where this stabilization is done for this case.
> Here's what's in the bidi algorithm (not much about compatibility, but there
> is in the data file the code point for the mirrored glyph of each character
> that can be mirrored, when there is one):
> "In implementation, sometimes pairs of characters are acceptable mirrors for
> one another - for example, U+0028 "(" LEFT PARENTHESIS and U+0029 ")" RIGHT
> PARENTHESIS or U+22E0 "⋠" DOES NOT PRECEDE OR EQUAL and U+22E1 "⋡" DOES NOT
> SUCCEED OR EQUAL. Other characters such as U+2231 "∱" CLOCKWISE INTEGRAL do
> not have corresponding characters that can be used for acceptable mirrors.
> The informative Bidi Mirroring data file [Data9],
> {MY NOTE: for Data9 see
> http://www.unicode.org/Public/UNIDATA/BidiMirroring.txt}
> lists the paired characters with acceptable mirror glyphs. The formal
> property name for this data in the Unicode Character Database [UCD] is
> Bidi_Mirroring_Glyph. A comment in the file indicates where the pairs are
> "best fit": they should be acceptable in rendering, although ideally the
> mirrored glyphs may have somewhat different shapes."
> Also
> "4. A character is depicted by a mirrored glyph if and only if (a) the
> resolved directionality of that character is R, and (b) the Bidi_Mirrored
> property value of that character is true.
>
> The Bidi_Mirrored property is defined by Section 4.7, Bidi
> Mirrored - Normative of [Unicode]; the property values are specified in [UCD].
> This rule can be overridden in certain cases; see HL6.
>
> "For example, U+0028 left parenthesis - which is interpreted in the Unicode
> Standard as an opening parenthesis - appears as "(" when its resolved level is
> even, and as the mirrored glyph ")" when its resolved level is odd. Note
> that for backward compatibility the characters U+FD3E (﴾) ORNATE LEFT
> PARENTHESIS and U+FD3F (﴿) ORNATE RIGHT PARENTHESIS are not mirrored."
> Regarding your comments below (sorry I have to respond here; hotmail does
> not work well in my Mozilla browser on this laptop, though it works with
> Mozilla on some PCs; maybe I should switch to IE for my browser): what do you
> mean by mathematical symbols that are letter-like but mirrorable? Operators
> such as > or < and parentheses are the ones that are mirrorable and that would
> work with the bidi parentheses algorithm, in my view (see
> http://www.unicode.org/review/pri231/pri231-background.pdf section 3.2 for
> what works, if you need a link). So which mathematical symbols are both
> letter-like and mirrorable (you mean suitable for the bidi parentheses
> algorithm, right?): greater than? less than? I can't really think of which
> are letter-like, so examples would help.
> Thanks, and sorry for responding above rather than below your comment on this.
> (P.S. I decided to cc the list with these comments instead of all the
> individuals cc'd in the original email; hope that's o.k.)
>
>
> Best,
>
> --C. E. Whitehead
> cewcathar_at_hotmail.com
>> And there are still validity constraints that are not checked for the
>> mapping of mirrored pairs (when they exist, because not all characters
>> are encoded as mirrored pairs, even if they are BiDi-neutral and
>> mirrorable: this includes many mathematical symbols and operators
>> that are not letter-like; mathematical symbols that are letter-like
>> but still mirrorable should be encoded as separate characters because
>> they are not BiDi-neutral: this is a justification for enhancing the
>> stability rules, as it impacts the policy about which characters are
>> encodable and which are not).
Received on Thu Jan 24 2013 - 22:26:01 CST
