Trying to understand Line_Break property apparent discrepancy

From: Karl Williamson <public_at_khwilliamson.com>
Date: Mon, 11 Jan 2016 15:42:37 -0700

It appears that
http://www.unicode.org/Public/8.0.0/ucd/auxiliary/LineBreakTest.txt is
testing a tailoring rather than the default line break algorithm,
contrary to its heading "# Default Line Break Test". And
http://www.unicode.org/Public/UCD/latest/ucd/auxiliary/LineBreakTest.html follows
along.

For example, the default algorithm as shown in
http://www.unicode.org/reports/tr14/#Table2 follows LB25, which is an
approximation of the desired behavior. But the test and html don't
follow this. I suspect they are looking for the tailoring described in
http://www.unicode.org/reports/tr14/#Examples example 7.

For example, both the test file and the html treat a class CL code point
followed by a class PO one as an unconditional line break opportunity,
based on rule 999 (which is the same as LB31 in TR14).
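To make the test-file side concrete, here is a minimal Python sketch of how one data line could be read. It assumes the usual layout of alternating ÷ / × marks and hexadecimal code points, with a trailing "#" comment. The name parse_test_line and the sample line are my own illustration; the sample is shaped like the file's entries but is not copied from it.

```python
BREAK = "\u00f7"     # the division sign: a break is expected at this position
NO_BREAK = "\u00d7"  # the multiplication sign: no break is expected here


def parse_test_line(line):
    """Split one data line into (code_points, marks).

    marks[i] is True when a break is expected before code_points[i];
    the final entry describes the position at end of text.
    Returns None for blank or comment-only lines.
    """
    data = line.split("#", 1)[0].strip()
    if not data:
        return None
    tokens = data.split()
    marks = [tok == BREAK for tok in tokens[0::2]]
    code_points = [int(tok, 16) for tok in tokens[1::2]]
    return code_points, marks


# Hypothetical sample: U+007D (class CL) followed by U+0025 (class PO).
sample = "\u00d7 007D \u00f7 0025 \u00f7\t# illustrative CL then PO"
code_points, marks = parse_test_line(sample)
print([hex(cp) for cp in code_points], marks)
```

In this sketch marks[1] is the position between the two code points, which is the one in dispute here.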

By contrast, http://www.unicode.org/reports/tr14/#Table2 says that a class
CL code point followed by a class PO one is an

    "indirect break opportunity B % A is equivalent to B × A and
    B SP+ ÷ A; in other words, do not break before A, unless one or
    more spaces follow B."

This is by LB25 and LB18.
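The difference between the two readings can be sketched in a few lines of Python. This is only an illustration under my own naming: the action symbols follow the Table 2 legend, but the two one-entry dictionaries (TABLE2 and TEST_FILE) are hypothetical stand-ins holding just the CL/PO pair discussed here, not the full pair table.

```python
# Pair-table actions, named after the symbols in the TR14 Table 2 legend.
DIRECT = "_"      # B ÷ A: break allowed directly between B and A
INDIRECT = "%"    # B × A, but B SP+ ÷ A: break only if spaces intervene
PROHIBITED = "^"  # B SP* × A: no break, even across spaces


def break_allowed(action, spaces_between):
    """Return True if a break is allowed before A, given the pair action
    and the number of spaces between B and A."""
    if action == DIRECT:
        return True
    if action == INDIRECT:
        return spaces_between > 0
    if action == PROHIBITED:
        return False
    raise ValueError(f"unknown pair action: {action!r}")


# Hypothetical one-entry tables covering only the pair in question.
TABLE2 = {("CL", "PO"): INDIRECT}    # what Table 2 (LB25 plus LB18) gives
TEST_FILE = {("CL", "PO"): DIRECT}   # what the test file and html expect

pair = ("CL", "PO")
for spaces in (0, 1):
    print(spaces,
          break_allowed(TABLE2[pair], spaces),
          break_allowed(TEST_FILE[pair], spaces))
```

The two only disagree when no space intervenes: Table 2 gives no break between CL and PO, while the test data expects one.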

There is a discrepancy here, which could be resolved either by changing
the tests and html to follow LB25, or by documenting that they test
something beyond the default algorithm. (There may also be other
discrepancies that I haven't stumbled upon.)
Received on Mon Jan 11 2016 - 16:44:22 CST