Re: Potential contradiction between the WordBreak test data and UAX #29

From: Tom Hacohen <>
Date: Wed, 23 Nov 2016 12:04:30 +0000

On 23/11/16 11:45, Daniel Bünzli wrote:
> On Wednesday 23 November 2016 at 12:28, Tom Hacohen wrote:
>> I took a look at the ICU sources, and they explicitly mention this case,
>> so it seems I was mistaken with interpreting the intention of the UAX. I
>> still find it confusing, but based on this thread, it seems to just be me.
> It's not only you, I also sometimes get confused by it (see for example [1] and subsequent messages). Maybe the operational model could be clarified a bit.

The comment I quoted from the ICU sources clarifies the intention. Maybe
a comment similar to that one would be helpful?

Also, thinking about it a bit more, the operational order makes sense
when you consider the CR LF case and extended characters. However, it is
still not obvious from the wording.
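
To illustrate the ordering point, here is a minimal Python sketch. It is a
toy under stated assumptions, not the ICU code and not a full UAX #29
implementation: it models only WB3, WB3a, WB3b and the "do not break before
Extend" part of WB4, and breaks everywhere else. The helper names
(is_extend, is_newline, break_allowed, boundaries) are made up for this
example, and is_extend only approximates Word_Break=Extend via the general
category.

```python
import unicodedata

CR, LF = "\r", "\n"

def is_extend(ch):
    # Assumption: approximate Word_Break=Extend with mark categories.
    return unicodedata.category(ch) in ("Mn", "Me", "Mc")

def is_newline(ch):
    # CR, LF and the Word_Break=Newline code points.
    return ch in (CR, LF, "\x0b", "\x0c", "\x85", "\u2028", "\u2029")

def break_allowed(before, after):
    # Rules are tried in order; the first one that matches decides.
    if before == CR and after == LF:  # WB3: CR x LF
        return False
    if is_newline(before):            # WB3a: break after newlines
        return True
    if is_newline(after):             # WB3b: break before newlines
        return True
    if is_extend(after):              # WB4 (partial): no break before Extend
        return False
    return True                       # stand-in for WB999; WB5 onward omitted

def boundaries(text):
    # Offsets inside the string at which a break is allowed.
    return [i for i in range(1, len(text))
            if break_allowed(text[i - 1], text[i])]

print(boundaries("a\u0308\r\n\u0308b"))  # prints [2, 4, 5]
```

Because WB3a is tried before WB4, the U+0308 after the LF gets a break in
front of it (offset 4), while the U+0308 after "a" stays attached (no break
at offset 1), and CR LF is never split (no break at offset 3).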

Thanks again.

Received on Wed Nov 23 2016 - 06:05:10 CST
