From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Nov 26 2004 - 13:13:46 CST
From: "Mark Davis" <mark.davis@jtcsv.com>
>I want to correct some misperceptions about CGJ; it should not be used for
> ligatures.
True. CGJ is a combining character that extends the grapheme cluster started 
before it, but it does not imply any linking with the next grapheme cluster 
starting at a base character.
So, even if one encodes A+CGJ+E, there will still be two distinct grapheme 
clusters, A+CGJ and E. The trailing CGJ in A+CGJ is probably just pollution: 
this CGJ has no influence on the collation order, so the sequence A+CGJ+E 
will collate like A+E, and it does not influence the rendering either.
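As a quick check of the properties mentioned above, here is a minimal Python 
sketch using the standard unicodedata module. The naive_sort_key helper is 
hypothetical: it only models the assumption that CGJ is ignored when 
comparing strings, and is not a real collation implementation.

```python
import unicodedata

CGJ = "\u034f"  # U+034F COMBINING GRAPHEME JOINER

print(unicodedata.name(CGJ))       # COMBINING GRAPHEME JOINER
print(unicodedata.category(CGJ))   # Mn (nonspacing mark, i.e. a combining character)
print(unicodedata.combining(CGJ))  # 0 (canonical combining class)


def naive_sort_key(text):
    """Hypothetical comparison key: drop CGJ before comparing.

    Assumption for illustration only; a real collator uses weight tables.
    """
    return text.replace(CGJ, "")


marked = "A" + CGJ + "E"

# Under the stated assumption, A+CGJ+E compares equal to A+E.
print(naive_sort_key(marked) == naive_sort_key("AE"))

# Normalization does not remove the CGJ, so it stays in the stored text.
print(unicodedata.normalize("NFC", marked) == marked)
```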
A "correct" ligaturing would be A+ZWJ+E, which creates three default 
grapheme clusters that can be rendered as a single ligature, or as 
separate A and E glyphs if the ZWJ is ignored.
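The two possible outcomes can be sketched with a toy shaper in Python. 
Everything here is hypothetical: the LIGATURES table stands in for whatever 
a font provides, the shape function is not any real rendering API, and the 
precomposed letter U+00C6 is used only as a visible stand-in for a ligature 
glyph.

```python
ZWJ = "\u200d"  # U+200D ZERO WIDTH JOINER

# Hypothetical font table: pairs of letters that have a ligature glyph.
LIGATURES = {("A", "E"): "\u00c6"}


def shape(text, supports_ligatures=True):
    """Toy shaper: honor letter+ZWJ+letter if a ligature exists, else drop ZWJ."""
    out = []
    i = 0
    while i < len(text):
        if (supports_ligatures
                and i + 2 < len(text)
                and text[i + 1] == ZWJ
                and (text[i], text[i + 2]) in LIGATURES):
            # The ligature is requested and available: emit a single glyph.
            out.append(LIGATURES[(text[i], text[i + 2])])
            i += 3
        elif text[i] == ZWJ:
            # ZWJ ignored: it produces no glyph of its own.
            i += 1
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


requested = "A" + ZWJ + "E"
print(shape(requested))                            # single ligature glyph
print(shape(requested, supports_ligatures=False))  # separate A and E glyphs
```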
For example, a ligaturing opportunity can be encoded explicitly in the 
French word "efficace":
"ef"+ZWJ+"f"+ZWJ+"icace".
Note, however, that the ZWJ prohibits breaking, even though French allows 
a hyphenation at the first occurrence, which is also a syllable break, but 
not at the second occurrence, which falls in the middle of the second 
syllable.
I don't know how one can encode an explicit ligaturing opportunity while 
also encoding the possibility of a hyphenation (where the sequence above 
would be rendered as if the first ZWJ had been replaced by a hyphen 
followed by a newline).
To encode the hyphenation opportunity, normally I would use the SHY format 
control (soft hyphen):
"ef"+SHY+"fi"+SHY+"ca"+SHY+"ce"
If I also want to encode explicit ligatures for the "ffi" cluster when it 
is not hyphenated, I need to add ZWJ:
"ef"+ZWJ+SHY+"f"+ZWJ+"i"+SHY+"ca"+SHY+"ce"    (1)
The problem is whether ZWJ will have the expected role of enabling a 
ligature if it is inserted between a letter and a SHY, instead of directly 
between the two ligated letters. In any case, the ligature should not be 
rendered if hyphenation does occur; otherwise the SHY should be ignored. So 
two renderings are to be generated, depending on the presence or absence of 
the conditional syllable break:
- syllable break occurs, render as: "ef-"+NL+"f"+ZWJ+"icace", i.e. with a 
ligature only for the "fi" pair, but not for the "ff" pair and not even for 
the generated "f"+hyphen...
- syllable break does not occur, render as "ef"+ZWJ+"f"+ZWJ+"icace", i.e. 
with the 3-letter "ffi" ligature...
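The two expected renderings can be written down as a small Python sketch. 
This only models the behavior I would like to get; the render function and 
its break_index parameter are hypothetical, and nothing here claims that 
actual renderers treat ZWJ next to SHY this way.

```python
ZWJ = "\u200d"  # U+200D ZERO WIDTH JOINER
SHY = "\u00ad"  # U+00AD SOFT HYPHEN

# The sequence labeled (1) in the text.
marked = "ef" + ZWJ + SHY + "f" + ZWJ + "i" + SHY + "ca" + SHY + "ce"


def render(text, break_index=None):
    """Return the logical string to display.

    break_index is the 0-based index of the SHY taken as a line break,
    or None when the word is not hyphenated.
    """
    parts = text.split(SHY)
    if break_index is None:
        # No break: every SHY is ignored and the ZWJs keep their role.
        return "".join(parts)
    if not 0 <= break_index < len(parts) - 1:
        raise ValueError("no such soft hyphen")
    # Break taken: drop any ZWJ touching the break, so no ligature is
    # requested across the hyphen; the other SHYs are ignored.
    left = "".join(parts[:break_index + 1]).rstrip(ZWJ)
    right = "".join(parts[break_index + 1:]).lstrip(ZWJ)
    return left + "-\n" + right


def plain(rendered):
    """Strip ZWJ and the generated hyphen and newline, for a sanity check."""
    return rendered.replace(ZWJ, "").replace("-\n", "")


unbroken = render(marked)                # "ef"+ZWJ+"f"+ZWJ+"icace"
broken = render(marked, break_index=0)   # "ef-"+NL+"f"+ZWJ+"icace"
print(repr(unbroken))
print(repr(broken))
print(plain(unbroken) == plain(broken) == "efficace")
```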
I am not sure if the string coded as (1) above has the expected behavior, 
including for collation where it should still collate like the unmarked word 
"efficace"...