**Previous message:**Kenneth Whistler: "Re: Unicode and the digital divide."**In reply to:**Mark Davis: "Re: 3 big bidi bugs"**Next in thread:**Markus Scherer: "Re: from 4 to null (was: 3 big bidi bugs)"**Reply:**Markus Scherer: "Re: from 4 to null (was: 3 big bidi bugs)"**Messages sorted by:**[ date ] [ thread ] [ subject ] [ author ] [ attachment ]**Mail actions:**[ respond to this message ] [ mail a new topic ]

Mark Davis wrote:

> One could wish for a simpler algorithm (for that matter, one could
> wish that people had uniform writing directions, or that Brits would
> drive on the right side of the road). As to ByText, you are on your
> own (in many ways).

ByText? What’s that? One could wish for a simpler algorithm and get it using
Bytext. Anyway, I didn’t ask about Bytext. I know all about it :) I did,
however, ask if you could verify that no implementation of the Unicode
bidirectional algorithm works with the Unicode 3.20 compatible “from 4 to
null” logic (below). It should be easy for you to do this at least for ICU,
but you were strangely silent on this question. Bidirectional users might
like to know that there are no bugs in their <= 3.20 implementations.

___from “3 bidi bugs” thread:

Let's say you have a line consisting of characters that all have embedding
level 4... How is "3" considered to be the lowest odd level on that line? It's
no more the lowest odd level than 5 or 1 is. At best, if you consider a
character with embedding level 4 to actually consist of 4 and each lower
embedding level (4, 3, 2, 1, and zero), which is not entirely unreasonable,
then 1 will always be the lowest odd embedding level on every line except a
line consisting of all zeros. But since L2 doesn't say "...to 1", it rules
out this interpretation.

A function implementing L2 might go through the following steps on each line:

1. find the highest level
2. find the lowest odd level
...

For a line consisting of all 4's as above, step 1 will return 4 and step 2
should return null since there are no odd levels on the line. A list
consisting of "from 4 to null" can only reasonably be interpreted as
consisting only of 4. Going on with this you get the "bugs" I describe.

___
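To make the two readings above concrete, here is a minimal Python sketch of an L2-style reversal loop. It is only an illustration under stated assumptions, not code from ICU or any other implementation: the names `lowest_odd_level`, `reorder_line`, and the `floor_when_no_odd` parameter are hypothetical. With the default, a line that has no odd level is treated as "from 4 to null", meaning only its highest level; passing `floor_when_no_odd=1` models the "4, 3, 2, 1" reading instead.

```python
def lowest_odd_level(levels):
    """Return the lowest odd embedding level on the line, or None if there is none."""
    odd = [lv for lv in levels if lv % 2 == 1]
    return min(odd) if odd else None


def reorder_line(chars, levels, floor_when_no_odd=None):
    """Reverse runs level by level, from the highest level down to a lower bound.

    Assumption (hypothetical knob): when the line has no odd level,
    floor_when_no_odd=None applies the literal "from 4 to null" reading,
    so only the highest level is processed. Any integer value is used
    as the lower bound instead.
    """
    if len(chars) != len(levels):
        raise ValueError("chars and levels must have the same length")
    if not levels:
        return []

    highest = max(levels)                 # step 1: find the highest level
    lowest = lowest_odd_level(levels)     # step 2: find the lowest odd level
    if lowest is None:
        lowest = highest if floor_when_no_odd is None else floor_when_no_odd

    items = list(zip(chars, levels))
    for level in range(highest, lowest - 1, -1):
        i = 0
        while i < len(items):
            if items[i][1] >= level:
                # Find the end of the contiguous run at this level or higher.
                j = i
                while j < len(items) and items[j][1] >= level:
                    j += 1
                items[i:j] = items[i:j][::-1]
                i = j
            else:
                i += 1
    return [c for c, _ in items]


if __name__ == "__main__":
    line = list("abcd")
    print(reorder_line(line, [4, 4, 4, 4]))                       # literal reading
    print(reorder_line(line, [4, 4, 4, 4], floor_when_no_odd=1))  # "4, 3, 2, 1" reading
```

In this sketch the literal reading reverses the all-4 line once, while the "4, 3, 2, 1" reading reverses it four times and so leaves it in its original order. That difference is the whole point of the question.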

--- Bernard Rafael Miller, email: bernard_r_miller@bytext.org Format enabling simplified 8-bit regexes of UCS characters: www.bytext.org --- "Progress is a nice word. But change is its motivator and change has its enemies." --Robert F. Kennedy


*This archive was generated by hypermail 2.1.2 : Fri May 31 2002 - 15:36:45 EDT*