From: Peter Constable (firstname.lastname@example.org)
Date: Thu Dec 06 2007 - 10:51:09 CST
> From: email@example.com [mailto:firstname.lastname@example.org] On
> Behalf Of Karl Pentzlin
Replying in reverse order:
> b.) Why U+FD3E and U+FD3F have the Bidi_mirroring property not set?
IIRC, this is by design for back-compat reasons. I believe it has been discussed on this list before.
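For anyone who wants to check the mirroring flag themselves, here is a minimal sketch using Python's standard unicodedata module. The helper name mirroring_report is made up for illustration, and the results reflect whatever Unicode version a given Python build bundles, so they may not match the data as it stood at the time of this thread.

```python
import unicodedata

def mirroring_report(chars):
    """Return (code point, name, Bidi_Mirrored flag) for each character."""
    return [
        (f"U+{ord(c):04X}", unicodedata.name(c), bool(unicodedata.mirrored(c)))
        for c in chars
    ]

# Compare the ornate parentheses against the ordinary ASCII pair.
for row in mirroring_report("()\uFD3E\uFD3F"):
    print(row)
```

The ASCII parentheses are included only as a reference point, since they are mirrored characters.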
> This leads to my questions:
> a.) Why U+FD3E has GC property Ps and U+FD3F has Pe, and not vice
Good question. Primary usage with Arabic seems to suggest vice versa. Mind, since in principle they can be used in either direction, something neutral such as Po might make sense. A key question to consider is what derived properties and algorithms would be affected by a change. For instance, switching Ps/Pe values for these characters would have a follow-on effect for line breaking:
Current:
  FD3E gc=Ps, lb=OP
  FD3F gc=Pe, lb=CL
After swapping:
  FD3E gc=Pe, lb=CL
  FD3F gc=Ps, lb=OP
That would result in a significant change in line-breaking behaviour, though it would probably be an improvement for use in Arabic text (and detrimental for use in LTR text). But changing to a neutral category such as Po would have a far more substantial impact on line breaking, since both characters would then have lb=AL; in particular, neither would behave like closing punctuation.
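To make the three cases concrete, here is a small sketch that tabulates them. The gc-to-lb mapping is reduced to just the three pairings mentioned in this message and is not the full derivation behind the line-breaking data; the names GC_TO_LB, SCENARIOS and line_break_classes are illustrative only.

```python
# Simplified gc -> lb mapping, limited to the pairings discussed in this message.
# Illustration only; the real line-break derivation has many more rules.
GC_TO_LB = {"Ps": "OP", "Pe": "CL", "Po": "AL"}

# The three assignments under discussion for the ornate parentheses.
SCENARIOS = {
    "current": {"FD3E": "Ps", "FD3F": "Pe"},
    "swapped": {"FD3E": "Pe", "FD3F": "Ps"},
    "neutral": {"FD3E": "Po", "FD3F": "Po"},
}

def line_break_classes(scenario):
    """Map each code point in a scenario to the lb class its gc implies."""
    return {cp: GC_TO_LB[gc] for cp, gc in SCENARIOS[scenario].items()}

for name in SCENARIOS:
    print(name, line_break_classes(name))
```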
There are no contingent line-breaking properties -- break this way for RTL but that way for LTR. So, there's no way to assign properties to these characters that provide the desired behaviour in all scenarios. Since -- at least, for line breaking -- a tailoring is needed to do the right thing in all cases, perhaps there's not a lot of value in changing the properties.
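As a rough sketch of what such a tailoring might look like, the following picks a line-break class from the paragraph direction. The function tailored_lb and its rtl parameter are hypothetical, and the assumption that U+FD3F opens in RTL text while U+FD3E opens in LTR text is taken from the usage described above, not from any specification.

```python
def tailored_lb(code_point, rtl):
    """Pick an lb class for an ornate parenthesis given paragraph direction.

    Hypothetical tailoring: in RTL text U+FD3F opens and U+FD3E closes;
    in LTR text the roles are reversed.
    """
    if code_point not in (0xFD3E, 0xFD3F):
        raise ValueError("only U+FD3E and U+FD3F are handled in this sketch")
    opens = (code_point == 0xFD3F) if rtl else (code_point == 0xFD3E)
    return "OP" if opens else "CL"

for rtl in (False, True):
    direction = "RTL" if rtl else "LTR"
    print(direction, tailored_lb(0xFD3E, rtl), tailored_lb(0xFD3F, rtl))
```

A real implementation would need to hook this into whatever line-breaking engine is in use; the point here is only that the choice has to be made outside the fixed property values.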