Problem in Line breaking

From: Satoshi Nakagawa
Date: Sat Feb 23 2008 - 13:48:18 CST


    I found a problem in the Unicode line breaking algorithm.

    In Japanese writing, [こたえは、answer] should be breakable after
    [、], so that [こたえは、] ends one line and [answer] starts the next.

    This is because [、](U+3001) and [。](U+3002) are used in Japanese
    just like the comma and period in English, and a line can be broken
    after a comma or period in English.

    But the current Unicode line breaking algorithm doesn't allow this
    behavior for (U+3001) and (U+3002).

    I think this is a problem with the Unicode line breaking algorithm.
    See .

    > CL: Closing Punctuation (XB)

    (U+3001) and (U+3002) are specified as CL.

    > LB30
    > Do not break between letters, numbers, or ordinary symbols and
    > opening or closing punctuation.
    > CL × (AL | NU)

    It says that no break is allowed between CL and a following
    alphabetic or numeric character. As a result, we cannot break the
    line at any position in [は、answer].
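
To make the effect concrete, here is a minimal Python sketch of the
pair rules involved. It is only an illustration under stated
assumptions, not the full algorithm: the class table covers just the
characters in this example, every other character is assumed to be AL
or NU, and only LB30 (as quoted above) plus what I believe are LB13
(no break before CL), LB28 (AL × AL) and LB31 (break everywhere else)
are modeled. The names `line_break_class`, `break_allowed` and
`break_positions` are made up for this sketch.

```python
# Simplified class table for this example only (not the real UAX #14 data).
CLASSES = {"は": "ID", "、": "CL", "。": "CL"}


def line_break_class(ch):
    """Return a simplified line breaking class for one character."""
    if ch in CLASSES:
        return CLASSES[ch]
    if ch.isdigit():
        return "NU"
    return "AL"  # assumption: anything else is treated as alphabetic


def break_allowed(before, after, apply_lb30=True):
    """Decide whether a break is allowed between two adjacent characters."""
    b, a = line_break_class(before), line_break_class(after)
    if a == "CL":
        return False  # LB13: no break before closing punctuation
    if apply_lb30 and b == "CL" and a in ("AL", "NU"):
        return False  # LB30 as quoted: CL × (AL | NU)
    if b == "AL" and a == "AL":
        return False  # LB28: no break between letters
    return True  # LB31: break everywhere else


def break_positions(text, apply_lb30=True):
    """Return the indices i where a break is allowed before text[i]."""
    return [
        i
        for i in range(1, len(text))
        if break_allowed(text[i - 1], text[i], apply_lb30)
    ]


print(break_positions("は、answer"))                    # with LB30 applied to 、
print(break_positions("は、answer", apply_lb30=False))  # without it
```

With LB30 applied, the sketch finds no break position at all in
[は、answer]. With LB30 switched off for the CL character, the only
position it reports is the one right after [、], which is the break
Japanese writing expects.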

    IMHO, (U+3001) and (U+3002) should not be treated as CL, because
    LB30 cannot be applied to them. They should be separated as a different

    What do you think?

    Satoshi Nakagawa
