L2/03-234

Source: Ken Whistler
Date: August 5, 2003

Rick, Lisa, Cathy,

I think this note from Paul is a significant element of the
thread re the Hebrew characters, and should be entered as an
L2 document and included in the agenda with the other Hebrew
documents for discussion at the next UTC. If, as seems likely,
the UTC will accept some variant of the CGJ proposal as the
way to resolve the Hebrew combining class issue, then this
catalog of desiderata from the UTC and the editorial committee
for how to document the result would be a useful starting point
for considering how to follow up.

I'm cc'ing Paul, since I don't know if he already asked for this
or has any objection to his note being put into the record on
this basis.

--Ken

------------- Begin Forwarded Message -------------

Subject: RE: More on Meteg and CGJ
Date: Wed, 30 Jul 2003 05:23:16 -0700
Thread-Topic: More on Meteg and CGJ
To: "John Hudson" <tiro@tiro.com>, "Kenneth Whistler" <kenw@sybase.com>
Cc: <Joan_Wardell@sil.org>, <unicode@unicode.org>, <kenw@sybase.com>, "Ralph 
Hancock" <hancock@dircon.co.uk>

As Ken has pointed out, the CGJ is classified as a NSM (non-spacing
mark) and therefore should not be treated as a control character. I will
therefore leave the glyph in the stream. Some issues are critically
important if this character is to be used as a "solution" for the
Biblical Hebrew problem.

1. The Unicode Editorial body needs to draft some documentation on the
use and behavior of the CGJ.
	- How font designers should handle that character. For example,
if show symbols is on, does the CGJ character have a visual
representation, or is it always hidden?
	- How NFC ordering is impacted when the character is used with
these cases of Biblical Hebrew. 
	- Suggested uses of the CGJ and indications where it should not
be used.
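
As a starting point for the NFC question in item 1, the sketch below
uses Python's standard unicodedata module to list the general category
and canonical combining class of the characters under discussion. Taking
HATAF PATAH as the hataf vowel is an assumption made for illustration;
the values reported come from the Unicode Character Database shipped
with the Python build.

```python
import unicodedata

# Characters from this thread; HATAF PATAH stands in for "hataf" (an assumption).
CHARS = {
    "shin": "\u05E9",
    "hataf patah": "\u05B2",
    "dagesh": "\u05BC",
    "meteg": "\u05BD",
    "shin dot": "\u05C1",
    "CGJ": "\u034F",
}

def describe(ch):
    """Return (general category, canonical combining class) for one character."""
    return unicodedata.category(ch), unicodedata.combining(ch)

for name, ch in CHARS.items():
    cat, ccc = describe(ch)
    print(f"{name:12} U+{ord(ch):04X}  gc={cat}  ccc={ccc}")
```

Marks with nonzero classes are sorted into ascending class order during
normalization, while a class of 0 (as for the CGJ) marks a character
that this reordering does not cross.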

2. As a NSM, does the CGJ become part of the sorting and text matching
algorithms or not? This has an impact on users.
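
Whatever the collation tables end up saying, a plain code point
comparison does see the CGJ, because normalization does not remove it.
A minimal sketch, assuming Python's unicodedata module and HATAF PATAH
as the vowel; the helpers strip_cgj and same_ignoring_cgj are
hypothetical and only illustrate what an application would have to do if
it wanted to ignore the character when matching:

```python
import unicodedata

CGJ = "\u034F"

def strip_cgj(text):
    """Hypothetical helper: drop every CGJ before comparing."""
    return text.replace(CGJ, "")

def same_ignoring_cgj(a, b):
    """Compare two strings after removing CGJ and normalizing to NFD."""
    return (unicodedata.normalize("NFD", strip_cgj(a))
            == unicodedata.normalize("NFD", strip_cgj(b)))

with_cgj = "\u05E9\u05BD\u034F\u05B2"   # shin, meteg, CGJ, hataf patah
without_cgj = "\u05E9\u05BD\u05B2"      # shin, meteg, hataf patah

# Normalization alone keeps the CGJ, so the two strings stay different.
print(unicodedata.normalize("NFD", with_cgj)
      == unicodedata.normalize("NFD", without_cgj))

# Once the CGJ is stripped, both normalize to the same sequence.
print(same_ignoring_cgj(with_cgj, without_cgj))
```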

3. People making keyboard standards need to add this character to the
keyboard so users can input it. Of course, this will confuse users who
never see the CGJ or don't know about its existence because there is
never a visual clue that it is present.


Personally, I still view using the CGJ as a "patch" because it is not
fixing the root problem. It is only putting a bandaid on top of the
wound. It would be much better to assign correct weights to the Hebrew
combining marks.

Regards,

Paul



-----Original Message-----
From: John Hudson [mailto:tiro@tiro.com] 
Sent: Tuesday, July 29, 2003 4:18 PM
To: Kenneth Whistler
Cc: Joan_Wardell@sil.org; unicode@unicode.org; kenw@sybase.com; Ralph
Hancock; Paul Nelson (TYPOGRAPHY)
Subject: Re: More on Meteg and CGJ

At 03:16 PM 7/29/2003, Kenneth Whistler wrote:


>How about:
>
>    shin < regular meteg < CGJ < hataf < dagesh < shindot
>
>The CGJ prevents the reordering of the meteg around the hataf and 
>dagesh, and the sequence <meteg, CGJ, hataf> gives the font a separate 
>sequence to ligate, distinguishing it from <hataf, dagesh, meteg> 
>above.
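
The blocking effect described in the quoted lines can be checked with a
short sketch, assuming Python's unicodedata module and taking HATAF
PATAH as the hataf vowel (an assumption). NFD is used so that only
canonical reordering is exercised; NFC applies the same reordering step.

```python
import unicodedata

SHIN, HATAF_PATAH, DAGESH, METEG, SHIN_DOT, CGJ = (
    "\u05E9", "\u05B2", "\u05BC", "\u05BD", "\u05C1", "\u034F")

def reorder(text):
    """Apply canonical reordering via NFD and return the code points in hex."""
    return [f"{ord(ch):04X}" for ch in unicodedata.normalize("NFD", text)]

# Meteg typed first, with and without a CGJ after it.
plain = SHIN + METEG + HATAF_PATAH + DAGESH + SHIN_DOT
joined = SHIN + METEG + CGJ + HATAF_PATAH + DAGESH + SHIN_DOT

print(reorder(plain))   # meteg (05BD) is moved behind hataf and dagesh
print(reorder(joined))  # meteg stays in front of the CGJ (034F)
```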


The meteg needs to be to the left of, i.e. after, the hataf vowel:

shin < hataf < CGJ < meteg < dagesh < shindot

I can make this work, although it requires some fancy footwork in the
font: I need to remove the CGJ in order not to confuse the mark
positioning lookups, but do so without producing the same glyph string
that results in the medial meteg ligation with the hataf vowel. This can
be done by including a second, unencoded meteg glyph in the font and
substituting this for the regular meteg whenever it is preceded by CGJ;
then the CGJ is removed and the new meteg positioned. [Not exactly tidy:
since apps like InDesign started presenting glyph sets to the user, I'm
less happy about including duplicate and potentially confusing glyphs.]
In VOLT expressions:

         #ccmp feature
                 #Second meteg lookup
                 meteg -> meteg.2
                 #in context:
                 CGJ |

                 #Remove CGJ lookup
                 <Any glyph> CGJ -> <Any glyph>
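
To make the order of the two lookups concrete, here is a rough Python
simulation over a list of glyph names. It is not VOLT or OpenType code,
and the function names are hypothetical; it only models the substitution
logic described above.

```python
def second_meteg_lookup(glyphs):
    """Substitute meteg -> meteg.2 when the preceding glyph is CGJ."""
    out = []
    for i, g in enumerate(glyphs):
        if g == "meteg" and i > 0 and glyphs[i - 1] == "CGJ":
            out.append("meteg.2")
        else:
            out.append(g)
    return out

def remove_cgj_lookup(glyphs):
    """Drop each CGJ that has a glyph before it, keeping that glyph."""
    out = []
    for g in glyphs:
        if g == "CGJ" and out:
            continue
        out.append(g)
    return out

def ccmp(glyphs):
    """Run the lookups in order: substitute the meteg first, then drop CGJ."""
    return remove_cgj_lookup(second_meteg_lookup(glyphs))

print(ccmp(["shin", "hataf", "CGJ", "meteg", "dagesh", "shindot"]))
print(ccmp(["shin", "hataf", "meteg", "dagesh", "shindot"]))
```

The order matters: if the CGJ were removed first, the meteg would no
longer have its context and the two input sequences would produce the
same glyph string.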

I *think* I can make pretty much any sequence involving CGJ work by
removing the CGJ glyph at an appropriately early stage in glyph
processing: it does its job in character ordering and then gets ditched
in display, having triggered any glyph substitutions necessary for
further processing. However, as noted before, this is entirely dependent
on CGJ being treated as a painted combining mark and *not* as an
unpainted control character. I'm still *very* nervous about this
proposed solution if there is a chance that applications will not paint
this character.

John Hudson



Tiro Typeworks		www.tiro.com
Vancouver, BC		tiro@tiro.com



------------- End Forwarded Message -------------