From: Philippe Verdy (firstname.lastname@example.org)
Date: Fri Jul 27 2007 - 08:05:41 CDT
OK, this proves that ICU works correctly including in the other cases I cited.
Anyway, update 20 of UAX#14 does not seem to have affected it, because it contains mostly editorial changes for clarity.
Why, then, can my suggested change to clarify rule LB30 not be considered too, given that it would not change anything in ICU?
I can also see that the current code does not split:
Between “Text” and the kana, because of the parentheses.
But if I remove the parentheses, a break now correctly occurs in:
Isn't that inconsistent? And isn't this one case where my refined rule (treating opening and closing punctuation as if they were grapheme extenders of the adjacent grapheme of the inner text) would work better? Yes, I can see that this change would require an additional lookahead to skip the opening or closing punctuation. I suspect that this case will occur in Japanese and Chinese text where short runs of Latin are embedded without any separating space, and where those runs may be written with or without delimiting punctuation.
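To make the proposal concrete, here is a minimal sketch in Python. It is a toy model, not UAX#14 and not the ICU implementation: the class mapping in `line_class`, the helper names, and the sample strings "Text(かな)" and "Textかな" are all assumptions chosen for illustration. It models only an LB30-like rule plus a few neighbouring rules, and adds an optional "refined" mode that looks past opening or closing punctuation to the inner text before deciding.

```python
import unicodedata

TEXT_CLASSES = ("AL", "NU", "ID")

def line_class(ch):
    """Rough, hypothetical stand-in for UAX#14 line-break classes."""
    if ch == " ":
        return "SP"
    cat = unicodedata.category(ch)
    if cat == "Ps":
        return "OP"   # opening punctuation
    if cat == "Pe":
        return "CL"   # closing punctuation
    if cat == "Nd":
        return "NU"
    name = unicodedata.name(ch, "")
    if name.startswith(("HIRAGANA", "KATAKANA", "CJK UNIFIED")):
        return "ID"   # kana and ideographs, lumped together here
    return "AL"       # everything else is treated as alphabetic

def pair_break(before, after):
    """Toy default: break allowed unless both sides are AL/NU."""
    return not (before in ("AL", "NU") and after in ("AL", "NU"))

def break_opportunities(text, refined=False):
    """Return the indices i where a break is allowed before text[i]."""
    classes = [line_class(ch) for ch in text]
    n = len(classes)
    result = []
    for i in range(1, n):
        before, after = classes[i - 1], classes[i]
        # No break before closing punctuation or a space, none after OP.
        if after in ("CL", "SP") or before == "OP":
            continue
        if before == "SP":
            result.append(i)  # break after spaces
            continue
        if refined:
            # Look ahead past opening punctuation to the inner text.
            j = i
            while j < n and classes[j] == "OP":
                j += 1
            if j < n and classes[j] in TEXT_CLASSES:
                after = classes[j]
            # Look back past closing punctuation to the inner text.
            k = i - 1
            while k >= 0 and classes[k] == "CL":
                k -= 1
            if k >= 0 and classes[k] in TEXT_CLASSES:
                before = classes[k]
        # LB30-like: (AL | NU) x OP and CL x (AL | NU).
        if before in ("AL", "NU") and after == "OP":
            continue
        if before == "CL" and after in ("AL", "NU"):
            continue
        if pair_break(before, after):
            result.append(i)
    return result
```

Under this toy model, `break_opportunities("Text(かな)")` returns `[6]` (no break before the parenthesis), while `break_opportunities("Text(かな)", refined=True)` returns `[4, 6]`, matching the break that the unparenthesized "Textかな" already gets at index 4. A purely Latin case such as "Sample Text(s)" is unaffected by the refined mode, since the inner text after the parenthesis is alphabetic.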
From: email@example.com [mailto:firstname.lastname@example.org] On behalf of Mark Davis
Sent: Friday, July 27, 2007 02:05
To: Kenneth Whistler
Cc: email@example.com; firstname.lastname@example.org
Subject: Re: UAX#14-20: undesirable line breaking opportunities (parentheses and quotation marks)
BTW, http://unicode.org/cldr/utility/breaks.jsp is a demo that shows how parens work differently in different contexts. Paste in some sample text like
Sample Text(s). ユニコードとは(何)か？