Re: Potential contradiction between the WordBreak test data and UAX #29

From: Tom Hacohen <tom_at_osg.samsung.com>
Date: Wed, 23 Nov 2016 11:14:09 +0000

On 23/11/16 11:11, Daniel Bünzli wrote:
>
> On Wednesday 23 November 2016 at 12:00, Tom Hacohen wrote:
>> This looks like a misstatement rather than a binding rule.
> Well at least to me it's pretty clear that this is not the case.
>
>
>> Even if that's true, look at my second statement (which you redacted in
>> your reply):
>
> I'm not arguing whether the boundaries produced by this process are good or not. I'm just saying that to me, the test data is consistent with the operational model and rules of UAX#29 as it exists.

I'm arguing that it's not, and I still don't agree with your understanding
of the operational model. Again, take a look at what I wrote in my last email:

Also take another look at
http://www.unicode.org/reports/tr29/#Grapheme_Cluster_and_Format_Rules
specifically the table that shows another way of writing the ignore
rule. This again shows my understanding of rule 4 is correct.

In particular, look at the following equivalence:
X Y × Z W ⇒ X (Extend | Format)* Y (Extend | Format)* × Z (Extend | Format)* W
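
To make the rewrite concrete, here is a minimal sketch of how that equivalence
could be applied mechanically to a rule written as whitespace-separated
tokens. The function name expand_rule and the token format are assumptions
made for illustration; they are not taken from UAX #29 or from any existing
implementation.

```python
# Hypothetical helper: applies the textual rewrite from the equivalence above.
# The rule is assumed to be whitespace-separated tokens, where "×" and "÷"
# are boundary operators and every other token is a property name.

IGNORED = "(Extend | Format)*"
OPERATORS = {"×", "÷"}


def expand_rule(rule: str) -> str:
    """Insert (Extend | Format)* after every property token except the last token."""
    tokens = rule.split()
    expanded = []
    for index, token in enumerate(tokens):
        expanded.append(token)
        is_last = index == len(tokens) - 1
        # Operators are left untouched; the final token gets no trailing run.
        if token not in OPERATORS and not is_last:
            expanded.append(IGNORED)
    return " ".join(expanded)


if __name__ == "__main__":
    print(expand_rule("X Y × Z W"))
```

Under these assumptions, the rule "X Y × Z W" expands to the right-hand side
of the equivalence quoted above, with the ignored run following X, Y and Z
but not W.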

--
Tom
Received on Wed Nov 23 2016 - 05:14:27 CST
