RE: UAX#14-20: undesirable line breaking opportunities (parentheses and quotation marks)

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Jul 27 2007 - 08:05:41 CDT


    OK, this proves that ICU works correctly, including in the other cases I cited.
    Anyway, update 20 of UAX#14 does not seem to have affected it, because that update contains mostly editorial changes for clarity.
    Why, then, can't my suggested change to clarify rule LB30 be considered too, given that it won’t change anything in ICU?

    I can also see that the current code does not split:
           Sample Text(ユニコードとは何か)
    between “Text” and the kana, because of the parenthesis.

    But if I remove the parentheses, a break now correctly occurs in:
            Sample Textユニコードとは何か
    Isn't this inconsistent? And isn't it a case where my refined rule (treating opening/closing punctuation as if it were a grapheme extender of the adjacent grapheme of the inner text) would be better? Yes, I can see that this change would require an additional forward lookahead to skip the opening/closing punctuation. I suspect that this case will occur in Japanese/Chinese text where short Latin inclusions are present without any space separation, and which may be written with or without delimiting punctuation.
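The two cases can be illustrated with a minimal sketch in Python. It is not ICU's implementation and not the full UAX#14 algorithm: it is a toy pair-based breaker that knows only a handful of classes (AL, ID, NS, OP, CL, SP) and the rules relevant here, with rule numbers as I understand them from UAX#14. The names `lb_class` and `can_break`, and the `lookahead` flag, are hypothetical. The flag models only one possible reading of the refinement suggested above (skip opening punctuation forward, and closing punctuation backward, before deciding).

```python
def lb_class(ch):
    """Very rough line-break class; real UAX#14 has many more classes."""
    cp = ord(ch)
    if ch == "(":
        return "OP"
    if ch == ")":
        return "CL"
    if ch == " ":
        return "SP"
    if ch == "\u30fc":  # prolonged sound mark, treated as nonstarter here
        return "NS"
    # Hiragana, Katakana and CJK Unified Ideographs are lumped into ID.
    if 0x3040 <= cp <= 0x30FF or 0x4E00 <= cp <= 0x9FFF:
        return "ID"
    return "AL"  # assumption: everything else behaves like a letter


def can_break(text, i, lookahead=False):
    """True if a break is allowed between text[i-1] and text[i]."""
    if i <= 0 or i >= len(text):
        return False
    before, after = lb_class(text[i - 1]), lb_class(text[i])
    if lookahead and after == "OP":
        # Hypothetical refinement: judge by what follows the opening run.
        j = i
        while j < len(text) and lb_class(text[j]) == "OP":
            j += 1
        if j < len(text):
            after = lb_class(text[j])
    if lookahead and before == "CL":
        # Symmetric case: judge by what precedes the closing run.
        j = i - 1
        while j >= 0 and lb_class(text[j]) == "CL":
            j -= 1
        if j >= 0:
            before = lb_class(text[j])
    if after == "SP":                      # LB7: no break before a space
        return False
    if before == "SP":                     # LB18: break after spaces
        return True
    if after in ("CL", "NS"):              # LB13, LB21
        return False
    if before == "OP":                     # LB14
        return False
    if before == "AL" and after == "OP":   # LB30
        return False
    if before == "CL" and after == "AL":   # LB30
        return False
    if before == "AL" and after == "AL":   # LB28
        return False
    return True                            # LB31: break everywhere else


with_paren = "Sample Text(ユニコードとは何か)"
no_paren = "Sample Textユニコードとは何か"
p = with_paren.index("(")
k = no_paren.index("ユ")
print(can_break(with_paren, p))                  # False: AL x OP
print(can_break(with_paren, p + 1))              # False: OP x ID
print(can_break(no_paren, k))                    # True: AL / ID
print(can_break(with_paren, p, lookahead=True))  # True under the sketch
```

With `lookahead=True` the sketch still refuses to break in `Text(s)`, because the character after the parenthesis is alphabetic, so only the Latin-to-ideographic transition gains a break opportunity.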

    ________________________________________
    From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf Of Mark Davis
    Sent: Friday, July 27, 2007 02:05
    To: Kenneth Whistler
    Cc: verdy_p@wanadoo.fr; unicode@unicode.org
    Subject: Re: UAX#14-20: undesirable line breaking opportunities (parentheses and quotation marks)

    BTW, http://unicode.org/cldr/utility/breaks.jsp is a demo that shows how parens work differently in different contexts. Paste in some sample text like

    Sample Text(s). ユニコードとは(何)か?
    Pick "Line"
    Click "Test"


