CGJ and ZWJ (was Re: Currency symbols)

From: Kenneth Whistler
Date: Mon Mar 10 2003 - 21:47:48 EST

    Antonio asked:

    > On 2003.02.25, 19:36, Asmus Freytag wrote:
    > > At 12:55 PM 2/25/03 +0000, António Martins-Tuválkin wrote:
    > >
    > > > Most (all?) of them are composable, either by means of letter +
    > > > slash (OSLI) or by ZWJ (for things like "Pta" or "Pts", if
    > > > anything),
    > >
    > > Using ZWJ for such things is frowned upon. The ZWJ [is] not a general
    > > purpose compositor.
    > Sorry. I mean such an invisible character that would keep those letters
    > together, even when the inter-character space is expanded, as if
    > they were in the same "lead type". (The same thing I'd use to decompose
    > U+0133 into i+THING+j.)
    > What Unicode character should be used for this, then?
    > > The ZWJ may be used to request a ligature between two characters,
    > Isn't this the role of CGJ (combining grapheme joiner)? «Indicates that
    > the adjoining characters are to be treated as a graphemic unit.»

    While the language has been confusing, the intent is the following.

    ZWJ/ZWNJ are used for control of cursive connection (as for Arabic),
        to affect exact glyph shaping in various Indic scripts,
        and to request ligation or non-ligation in various scripts.
        Think of them as a non-displaying "joining context" which is
        used by a rendering engine (or font) to influence the exact
        display of glyphs -- and in particular their visible connection
        to one another.
    CGJ (COMBINING GRAPHEME JOINER) is used to connect two (or more)
        characters together into a *logical* unit for the purposes
        of some processing. It is intended to create exceptional
        units (only if required) for processes such as boundary
        determination or sorting.
        Think of it as a character "gluer" that has no impact on
        display, per se.
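
    The distinction above can be made concrete with a minimal sketch,
    assuming Python and its standard unicodedata module. It only looks up
    the properties of the three characters and builds sample strings; the
    helper name and the sample strings are hypothetical, and whether a
    ligature actually appears depends entirely on the rendering engine
    and font.

```python
import unicodedata

ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER: requests non-connection / non-ligation
ZWJ = "\u200D"   # ZERO WIDTH JOINER: requests connection / ligation
CGJ = "\u034F"   # COMBINING GRAPHEME JOINER: logical "glue", no display effect


def describe(ch):
    """Return (code point, character name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


for ch in (ZWNJ, ZWJ, CGJ):
    print(*describe(ch))

# Hypothetical sample strings (illustration only).
request_ligature = "i" + ZWJ + "j"   # asks the renderer for an ij ligature
prevent_ligature = "i" + ZWNJ + "j"  # asks the renderer not to ligate
logical_unit = "i" + CGJ + "j"       # marks a unit for processing, not display

# None of these is the same string as plain "ij": the invisible character
# is a real code point in the text.
print(len(request_ligature), len(prevent_ligature), len(logical_unit))

# U+0133 has only a compatibility decomposition, and it yields plain "ij"
# with no joiner of any kind in between.
print(unicodedata.normalize("NFKD", "\u0133") == "ij")
```

    ZWJ and ZWNJ are format characters (general category Cf), while CGJ
    is a nonspacing mark (Mn) with combining class 0, which matches the
    split between a display-oriented "joining context" and a
    processing-oriented "gluer".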

