From: fantasai (fantasai.lists@inkedblade.net)
Date: Wed Apr 20 2011 - 22:31:52 CDT
On 04/20/2011 07:58 PM, Philippe Verdy wrote:
> I disagree, because it breaks the inherent nature of the script. Joins
> in Arabic are mandatory, and create "super grapheme clusters".
Joins in Arabic are mandatory, and they are also broken across lines
for hyphenation.
> When you say that « it does not consider morphemic, syllabic, or other
> boundaries », this is already wrong because it already considers the
> default grapheme cluster boundaries. Note that the default grapheme
> boundaries were designed only to be locale neutral. But here we are
> speaking about localization where the language and its script will
> matter, including in its fundamental properties. Joining types in
> Arabic are key parts of the script.
Which is why the joining behavior is preserved even though it is broken
across lines.
> But in the previous part of the specification, nothing speaks about
> them, and all that is left are the upper levels, where trying to find
> language-correct boundaries will fail. After this level, there should
> still be a level related to the script itself (independently of the
> language), before trying the last-chance "emergency" breaks. This
> intermediate level can still be prioritized, just as it was in the
> previous steps.
CSS does not prohibit such steps, but I do not think it should
prescribe them in this case. That's not what this feature is for.
> And yes, even in that case you could still insert the hyphenation
> symbol to show that the word was effectively broken (it is common
> practice to insert it, even in the Latin script and even if this is
> not the preferred syllabic or morphemic break position, which can only
> be inferred by language-specific rules and a lookup dictionary for
> handling many exception cases).
"word-break: break-word" does not insert hyphens. Hyphenation is a
different feature.
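
To illustrate the distinction, here is a minimal Python sketch, not CSS and
not taken from the specification. It contrasts an emergency break that only
splits the text with a break that also inserts a hyphen, as proposed in the
quoted message. The function names (split_clusters, emergency_break,
break_with_hyphen) are hypothetical, and the cluster splitting is a rough
approximation that only attaches combining marks to their base character; it
does not implement the full Unicode grapheme cluster rules, and the hyphen
variant does not look for language-correct break points.

```python
import unicodedata


def split_clusters(text):
    """Approximate grapheme clusters: attach combining marks to their base."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def emergency_break(text, width):
    """Split into lines of at most `width` clusters; nothing is inserted."""
    if width < 1:
        raise ValueError("width must be at least 1")
    clusters = split_clusters(text)
    return ["".join(clusters[i:i + width])
            for i in range(0, len(clusters), width)]


def break_with_hyphen(text, width, hyphen="-"):
    """Split like emergency_break, but reserve one column for a hyphen
    appended to every line except the last (a separate behavior)."""
    if width < 2:
        raise ValueError("width must be at least 2 to fit a hyphen")
    step = width - 1
    clusters = split_clusters(text)
    chunks = ["".join(clusters[i:i + step])
              for i in range(0, len(clusters), step)]
    return [c + hyphen for c in chunks[:-1]] + chunks[-1:]
```

With a width of 3, emergency_break("abcdefgh", 3) gives "abc", "def", "gh",
while break_with_hyphen("abcdefgh", 3) gives "ab-", "cd-", "ef-", "gh": the
second variant changes the rendered text, which is why it is treated as a
different feature.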
> The hyphenation symbol is generally very narrow, and if needed, it
> can still overflow a bit in the margin.
Note that overflowing even "a bit" still produces scrollbars.
> The choice of the hyphenation symbol is also a property of the script.
> In many East and South-East Asian scripts, there's not even any symbol
> for that, because breaks can occur between all grapheme clusters.
If you've got a pointer to resources indicating the correct hyphenation
symbol for various scripts or languages, I'd be interested in linking
that from the hyphenation section. :)
> Note: in Indic scripts, the danda and double-danda punctuation marks
> should be treated like the commas and stops in your spec and preferably
> not left alone on the next line, even if they fall within the margin (you
> showed cases for East-Asian scripts only : Han, Hiragana, Katakana,
> Hangul, Bopomofo, Yi, Mongolian...)
Are you talking about the rules for 'hanging-punctuation' or 'line-break'
or something else?
~fantasai