Unicode Bidirectional Algorithm

From: QSJN 4 UKR <qsjn4ukr_at_gmail.com>
Date: Mon, 12 Nov 2012 13:34:46 +0200

Input: <L1><RLO><NSM><L2><L3><PDF><L4>
Output: <L1.l><L3.r><L2.r><NSM.r><L4.l>
How should this be rendered?
You say «L3. Combining marks applied to a right-to-left base character
will at this point precede their base character. If the rendering
engine expects them to follow the base characters in the final display
process, then the ordering of the marks and the base character must be
reversed». That is not the case here. «Defective combining character
sequences should be rendered as if they had a no-break space as a base
character». Aha! Beautiful. But why should default ignorable code
points be used as the base for the NSM?
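
A minimal sketch of how the output at the top of this message arises from reordering rule L2. It assumes the explicit codes (RLO, PDF) have already been removed by X9 and that the override leaves NSM, L2 and L3 at level 1; the function name `reorder` and the (name, level) tuples are illustrative only, not taken from UAX #9.

```python
def reorder(items):
    """Rule L2, simplified: from the highest level down to the lowest odd
    level, reverse every contiguous run of items at that level or higher."""
    out = list(items)
    odd_levels = [lvl for _, lvl in out if lvl % 2 == 1]
    if not odd_levels:
        return out  # purely left-to-right line, nothing to reverse
    highest = max(lvl for _, lvl in out)
    for level in range(highest, min(odd_levels) - 1, -1):
        i = 0
        while i < len(out):
            if out[i][1] >= level:
                j = i
                while j < len(out) and out[j][1] >= level:
                    j += 1
                out[i:j] = reversed(out[i:j])  # reverse one run in place
                i = j
            else:
                i += 1
    return out

# Assumed resolved levels for <L1><RLO><NSM><L2><L3><PDF><L4> after X9.
line = [("L1", 0), ("NSM", 1), ("L2", 1), ("L3", 1), ("L4", 0)]
visual = [name for name, _ in reorder(line)]
print(visual)  # ['L1', 'L3', 'L2', 'NSM', 'L4']
```

In this visual order the NSM is the last item of the reversed run, so it has no base inside that run to be swapped with.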

<html>
<body>
<p>
a&#x202E;&#x0301;bc&#x202C;d
</p>
<p>
a&#x202E;&#x0301;bc&#x0301;de&#x0301;&#x202C;fg
</p>
</body>
</html>
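
As a quick check, the two test paragraphs can be decoded into their bidi classes with Python's standard `unicodedata` module. This is only a sketch; the variable names are illustrative, and the reported classes depend on the Unicode version bundled with the interpreter (these four characters have had stable classes for a long time).

```python
import unicodedata

# The same code points as in the two HTML paragraphs above.
first = "a\u202E\u0301bc\u202Cd"
second = "a\u202E\u0301bc\u0301de\u0301\u202Cfg"

first_classes = [unicodedata.bidirectional(ch) for ch in first]
second_classes = [unicodedata.bidirectional(ch) for ch in second]

print(first_classes)   # ['L', 'RLO', 'NSM', 'L', 'L', 'PDF', 'L']
print(second_classes)
```

The first paragraph therefore has the shape <L1><RLO><NSM><L2><L3><PDF><L4> given at the top of the message.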

It is rendered as
acbd
áed́cbfg

or maybe
acb́d
áed́cb́fg

It has to be:
acb ́d
aédćb ́fg
does it not?

Why not [[
X6. For all types besides NSM, BN, RLE, LRE, RLO, LRO, and PDF:
a. Set the level of the current character to the current embedding level.
b. Whenever the directional override status is not neutral, reset the
current character type according to the directional override status.

X9bis. Search backward from each instance of an NSM until the first
character of another type is found. Change the type of the NSM to the
type of the found character and set its level to the level of the
found character.

W1. (delete)
]]? In other words, make NSM completely transparent to bidi, just as
ZWJ/ZWNJ are.
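
The proposed X9bis can be sketched roughly as follows. This is only an illustration under stated assumptions: it runs after X9 has removed the explicit codes and BN, the modified X6 has left each NSM untouched (its level shown as `None`), and an NSM with nothing before it falls back to a caller-supplied value, which the proposal does not specify. The name `apply_x9bis` and the `fallback` parameter are hypothetical.

```python
def apply_x9bis(chars, fallback=("L", 0)):
    """chars: list of (bidi_type, level) pairs in logical order.
    Each NSM takes the type and level of the nearest preceding
    character whose type is not NSM."""
    resolved = []
    last = fallback  # assumption: used only when an NSM has no predecessor
    for bidi_type, level in chars:
        if bidi_type != "NSM":
            last = (bidi_type, level)
        # An NSM copies 'last'; any other character is appended unchanged.
        resolved.append(last)
    return resolved

# <L1><RLO><NSM><L2><L3><PDF><L4> with RLO and PDF already removed;
# L2 and L3 were overridden to R at level 1 by X6.
chars = [("L", 0), ("NSM", None), ("R", 1), ("R", 1), ("L", 0)]
result = apply_x9bis(chars)
print(result)  # [('L', 0), ('L', 0), ('R', 1), ('R', 1), ('L', 0)]
```

Under these assumptions the NSM ends up with the type and level of L1, so it is not moved by the reversal of the overridden run.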
Received on Mon Nov 12 2012 - 05:38:37 CST
