Re: UAX#14-20: undesirable line breaking opportunities (parentheses and quotation marks)

From: Mark Davis (mark.davis@icu-project.org)
Date: Thu Jul 26 2007 - 19:04:34 CDT

    BTW, http://unicode.org/cldr/utility/breaks.jsp is a demo that shows how
    parens work differently in different contexts. Paste in some sample text
    like

    Sample Text(s). ユニコードとは(何)か?

    then pick "Line" and click "Test".

    Mark

    On 7/25/07, Kenneth Whistler <kenw@sybase.com> wrote:
    >
    > Philippe Verdy wrote:
    >
    > > The line breaking opportunities do not seem to handle some special cases
    > > related to undesirable line breaks that are currently allowed.
    > > This comes up, for example, with parentheses, which currently always allow
    > > line breaks after or before them and the text they surround.
    >
    > That's not how I read UAX #14. I could be wrong, of course, but
    > reading the Example Pair Table, it seems clear that the table
    > specifies that such junctures are *indirect* line break opportunities,
    > but then that is the same treatment you get for any pair
    > of alphabetic characters in sequence, also.
    >
    > And in particular, the relevant rules are:
    >
    > LB28 Do not break between alphabetics.
    >
    > AL × AL
    >
    > LB30 Do not break between letters, numbers, or ordinary symbols and
    > opening or closing punctuation.
    >
    > (AL | NU) × OP
    > CL × (AL | NU)
    >
    > Those rules seem *already* to be doing exactly what you seem to
    > be asking for.
    >
    > Skipping over a fascinating excursion into French toponymy...
    >
    > > I can give another more common example where such linebreaks are
    > > undesirable:
    > > "un (ou plusieurs) mot(s)"
    > > Note how the "s" plural mark in "mots" is marked as an alternative; it is
    > > not separable from the word it normally completes. Inserting a linebreak
    > > between "mot" and "(s)" would be wrong.
    >
    > And UAX #14 does not suggest that one do so. See LB30 cited above.
    >
    >
    > > I propose disallowing line breaks around ***BOTH*** sides of:
    > > * (parentheses), or parenthesis-like characters like
    > > * [square brackets],
    >
    > etc., etc.
    >
    > This is already handled correctly in UAX #14.
    >
    > --Ken
    >
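
    For anyone who wants to poke at the two rules quoted above without the
    demo page, here is a minimal Python sketch of LB28 and LB30 only. The
    function names are made up for illustration, and the class lookup is an
    assumption: it guesses AL, NU, OP and CL from the Unicode general
    category rather than reading the real line break property data, so it is
    only reasonable for simple Latin text such as the "mot(s)" example (it
    would misclassify ideographs, for instance). It is not a UAX #14
    implementation; every other rule is left out.

```python
import unicodedata


def lb_class(ch):
    """Rough line break class guessed from the general category.

    Assumption for illustration only: real classes come from the Unicode
    line break property data, not from the general category.
    """
    if ch == " ":
        return "SP"
    cat = unicodedata.category(ch)
    if cat == "Ps":          # opening punctuation, e.g. ( [ {
        return "OP"
    if cat == "Pe":          # closing punctuation, e.g. ) ] }
        return "CL"
    if cat == "Nd":          # decimal digits
        return "NU"
    if cat.startswith("L"):  # letters, treated here as alphabetic
        return "AL"
    return "XX"              # anything else is outside this sketch


def no_break_between(before, after):
    """True if LB28 or LB30, as quoted above, prohibits a break here."""
    a, b = lb_class(before), lb_class(after)
    if a == "AL" and b == "AL":             # LB28: AL x AL
        return True
    if a in ("AL", "NU") and b == "OP":     # LB30: (AL | NU) x OP
        return True
    if a == "CL" and b in ("AL", "NU"):     # LB30: CL x (AL | NU)
        return True
    return False


def prohibited_positions(text):
    """Offsets i where these two rules forbid a break before text[i]."""
    return [i for i in range(1, len(text))
            if no_break_between(text[i - 1], text[i])]


if __name__ == "__main__":
    print(prohibited_positions("mot(s)"))
```

    Run on "mot(s)", the offset between "t" and "(" is among those reported,
    which is the juncture Philippe was worried about. The positions just
    inside the parentheses are not reported because they are governed by
    other rules in UAX #14 that this sketch does not model.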

    -- 
    Mark
    

