Re: Questionable lines on LineBreakTest.txt

From: Masaaki Shibata (shibatamasaaki@gmail.com)
Date: Tue Jun 08 2010 - 20:04:48 CDT

    > Note: The Line Break tests use tailoring of numbers described in Example 7 of Section 8.2 Examples of Customization.

    To tell the truth, I don't understand this line in LineBreakTest.txt.
    What does it mean? Why was this line added in 5.2.0?

    The result for CP x PO must be the same in both the rule-based and the
    regex implementations (i.e. "×") anyhow.

    By the way, I found a possible error in the UAX #14 document (5.2.0).

    Compare the descriptions of LB24, LB25 and Example 7 of chapter 8
    Customization with those in the previous version (5.1.0). In version
    5.2.0, they added a new line breaking class, CP (see Change History),
    and adjusted other rules to account for it. The regular expression just
    before LB25 must have been one of them:

    5.1.0
    ( PR | PO ) ? ( OP | HY ) ? NU (NU | SY | IS) * CL ? ( PR | PO ) ?
    5.2.0
    ( PR | PO) ? ( OP | HY ) ? NU (NU | SY | IS) * (CL | CP) ? ( PR | PO) ?
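
    To make the difference concrete, here is a minimal sketch that runs
    both forms of the expression over a sequence of line break classes.
    It assumes a line is modelled as a space-separated string of class
    names; that encoding and the names build, NUMERIC_510, NUMERIC_520
    and matched_classes are purely illustrative and do not come from
    UAX #14.

```python
import re

# Assumption: a line is modelled as a space-separated string of UAX #14
# line break class names, e.g. "NU CP PO". This encoding and all helper
# names here are illustrative, not part of the standard.

def build(closing):
    """Translate the numeric-context expression into a Python regex.

    `closing` is the alternation allowed for the optional closing class:
    "CL" for the 5.1.0 form, "CL|CP" for the 5.2.0 form.
    """
    def tok(alternatives):
        # One class token followed by a single separating space.
        return r"(?:(?:%s) )" % alternatives

    return re.compile(
        tok("PR|PO") + "?"
        + tok("OP|HY") + "?"
        + tok("NU")
        + tok("NU|SY|IS") + "*"
        + tok(closing) + "?"
        + tok("PR|PO") + "?"
    )

NUMERIC_510 = build("CL")      # ... CL ? ( PR | PO ) ?
NUMERIC_520 = build("CL|CP")   # ... (CL | CP) ? ( PR | PO ) ?

def matched_classes(pattern, classes):
    """Return the leading classes that the expression covers."""
    m = pattern.match(classes + " ")
    return m.group(0).split() if m else []

print(matched_classes(NUMERIC_510, "NU CP PO"))  # ['NU']
print(matched_classes(NUMERIC_520, "NU CP PO"))  # ['NU', 'CP', 'PO']
```

    With the 5.1.0 form the match stops after NU, so CP and PO fall
    outside the numeric expression; with the 5.2.0 form the whole
    sequence is covered.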

    But they seem to have forgotten the one in Example 7 of chapter 8:

    5.1.0
    ( PR | PO ) ? ( OP | HY ) ? NU (NU | SY | IS) * CL ? ( PR | PO ) ?
    5.2.0 (and 6.0.0 draft!)
    ( PR | PO ) ? ( OP | HY ) ? NU (NU | SY | IS) * CL ? ( PR | PO ) ?

    I think this should also be adjusted to match the change. (This does
    not affect our argument much though, because the 5.1.0 version of
    LineBreakTest.txt has the same problem with "CL PO". These are
    just incorrect.)


