Re: Questionable lines on LineBreakTest.txt

From: Mark Davis ☕ (mark@macchiato.com)
Date: Mon Jun 07 2010 - 20:24:11 CDT

    If the test files are "known to be in error", then those "known" cases need
    to be actually communicated back to the UTC; sitting on them doesn't do
    anyone any good.

    I have not had a chance to investigate, but this particular case may be
    covered by the description in
    http://unicode.org/Public/6.0.0/ucd/auxiliary/LineBreakTest-6.0.0d4.html:

    The Line Break tests use tailoring of numbers described in Example 7 of
    Section 8.2 Examples of Customization.
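
    To see why that tailoring could matter for this case, here is a minimal
    Python sketch. It assumes the usual test-file layout (break marks ÷ and ×
    alternating with hexadecimal code points, followed by a # comment). The
    sample line's data column is reconstructed from the comment quoted further
    down in this thread, and the class-name pattern is a transcription of the
    Example 7 regular expression, so both should be checked against the actual
    files. The names parse_test_line and is_numeric_run are hypothetical.

```python
import re

BREAK = "\u00f7"     # ÷ : break opportunity
NO_BREAK = "\u00d7"  # × : break prohibited


def parse_test_line(line):
    """Split one test line into (code_points, marks).

    marks[i] is the mark before code_points[i]; marks[-1] is the mark
    at the end of the text.
    """
    data = line.split("#", 1)[0].split()  # drop the trailing comment
    marks = data[0::2]
    code_points = [int(tok, 16) for tok in data[1::2]]
    if len(marks) != len(code_points) + 1:
        raise ValueError("marks and code points do not alternate")
    if any(m not in (BREAK, NO_BREAK) for m in marks):
        raise ValueError("unexpected break mark")
    return code_points, marks


# Assumed transcription of the Example 7 numeric pattern, written over
# space-separated line break class names:
#   (PR | PO)? (OP | HY)? NU (NU | SY | IS)* (CL | CP)? (PR | PO)?
NUMERIC = re.compile(
    r"(?:(?:PR|PO) )?(?:(?:OP|HY) )?NU(?: (?:NU|SY|IS))*"
    r"(?: (?:CL|CP))?(?: (?:PR|PO))?"
)


def is_numeric_run(classes):
    """Return True if the whole class sequence matches the numeric pattern."""
    return NUMERIC.fullmatch(" ".join(classes)) is not None


# Sample line; the data column is an assumption rebuilt from the quoted comment.
sample = "\u00f7 0029 \u00f7 0025 \u00f7\t# RIGHT PARENTHESIS (CP), PERCENT SIGN (PO)"
code_points, marks = parse_test_line(sample)
```

    Under these assumptions, CP PO on its own is not a numeric run, so a
    tailored implementation would presumably fall through to the default
    break (rule 999), whereas NU CP PO would be kept together.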

    Mark

    — Il meglio è l’inimico del bene —

    On Mon, Jun 7, 2010 at 17:11, Asmus Freytag <asmusf@ix.netcom.com> wrote:

    > On 6/7/2010 4:26 PM, Masaaki Shibata wrote:
    >
    >> I'm studying UAX #14 (5.2.0) and testing my code against
    >> LineBreakTest.txt, and I found that some test cases in this file seem
    >> to contradict the rules in the document.
    >>
    >> For example, LB25 explicitly prohibits breaking between CP and PO,
    >> while LineBreakTest.txt says "÷ [0.2] RIGHT PARENTHESIS (CP) ÷ [999.0]
    >> PERCENT SIGN (PO) ÷ [0.3]" (l. 1137).
    >>
    >> I'm not a Unicode expert; which rules lead to a result like this?
    >> Did I miss an important description in the document?
    >>
    >>
    > Probably not. The test file has been known to be wrong before.
    >
    > The spec clearly states that breaks are only allowed if there are spaces,
    > as in:
    >
    > CP SP+ ÷ OP
    >
    > So this line in the "test" file appears incorrect.
    >
    > A./
    >


