Re: UAX#14-20: undesirable line breaking opportunities (parentheses and quotation marks)

From: Kenneth Whistler (
Date: Wed Jul 25 2007 - 15:11:16 CDT

    Philippe Verdy wrote:

    > The line breaking opportunities do not seem to handle some special cases
    > related to undesirable line breaks that are currently allowed.
    > This comes up, for example, with parentheses, which currently always allow
    > line breaks before or after them and the text they surround.

    That's not how I read UAX #14. I could be wrong, of course, but
    reading the Example Pair Table, it seems clear that the table
    specifies that such junctures are *indirect* line break opportunities,
    but then that is the same treatment you get for any pair
    of alphabetic characters in sequence, also.

    And in particular, the relevant rules are:

    LB28 Do not break between alphabetics.

      AL × AL

    LB30 Do not break between letters, numbers, or ordinary symbols and
    opening or closing punctuation.

      (AL | NU) × OP
      CL × (AL | NU)

    Those rules seem *already* to be doing exactly what you seem to
    be asking for.

    Skipping over a fascinating excursion into French toponymy...

    > I can give another more common example where such linebreaks are
    > undesirable:
    > "un (ou plusieurs) mot(s)"
    > Note how the "s" plural mark in "mots" is marked as an alternative; it is
    > not separable from the word it normally completes. Inserting a linebreak
    > between "mot" and "(s)" would be wrong.

    And UAX #14 does not suggest that one do so. See LB30 cited above.
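
    To make that concrete, here is a minimal sketch of how LB28 and LB30,
    as cited above, apply to the "mot(s)" example. It is not an
    implementation of UAX #14: the helper names are made up for
    illustration, and line_break_class() only approximates the real
    Line_Break property from Unicode general categories. All other rules
    are left out.

```python
import unicodedata


def line_break_class(ch):
    """Rough stand-in for the UAX #14 Line_Break property.

    Assumption: classes are guessed from the general category; the real
    property comes from LineBreak.txt and differs in many details.
    """
    cat = unicodedata.category(ch)
    if cat == "Ps":          # opening punctuation, e.g. "(" or "["
        return "OP"
    if cat == "Pe":          # closing punctuation, e.g. ")" or "]"
        return "CL"
    if cat == "Nd":          # decimal digits
        return "NU"
    if cat.startswith("L"):  # letters
        return "AL"
    if ch == " ":
        return "SP"
    return "XX"              # everything else: not modelled here


def no_break_rule(before, after):
    """Return the rule that prohibits a break between two characters,
    or None if neither LB28 nor LB30 applies."""
    b, a = line_break_class(before), line_break_class(after)
    if b == "AL" and a == "AL":
        return "LB28"        # AL x AL
    if b in ("AL", "NU") and a == "OP":
        return "LB30"        # (AL | NU) x OP
    if b == "CL" and a in ("AL", "NU"):
        return "LB30"        # CL x (AL | NU)
    return None


text = "un (ou plusieurs) mot(s)"
for i in range(1, len(text)):
    rule = no_break_rule(text[i - 1], text[i])
    if rule is not None:
        print(f"no break between {text[i - 1]!r} and {text[i]!r}: {rule}")
```

    In this sketch the juncture between "t" and "(" in "mot(s)" is
    reported under LB30, so no break is allowed between "mot" and "(s)".
    Junctures such as "(" followed by "s" come back as None only because
    the rules that govern them are not among the two modelled here.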

    > I propose disallowing line breaks around ***BOTH*** sides of:
    > * (parentheses), or parenthesis-like characters like
    > * [square brackets],

    etc., etc.

    This is already handled correctly in UAX #14.


