From: Richard Wordingham (
Date: Wed Jun 14 2006 - 17:50:24 CDT

    The proposed summary of CGJ (U+034F COMBINING GRAPHEME JOINER) in the
    Unicode 5.0 glyph charts still says, 'indicates that adjoining characters
    are to be treated as a graphemic unit'. This has been completely wrong
    since TUS 4.1. What it now means is something like, 'indicates that
    adjoining characters do not interact non-graphically as one would
    otherwise expect'. The three examples I can think of are that the
    characters on either side of CGJ:

    (i) Do not swap places under normalisation (e.g. Hebrew metheg hiriq v.
    hiriq metheg).
    (ii) Form a diaeresis, not an umlaut, in Fraktur when the character after
    CGJ is U+0308.
    (iii) Do not form a normal 'contraction' for collation (e.g. CH in Slovak or
    NG in Welsh).
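
    Case (i) can be checked with a short Python sketch. It assumes only the
    standard unicodedata module; the choice of LAMED as the base letter and
    the helper name cps are mine, purely for illustration. METEG has
    canonical combining class 22 and HIRIQ has 14, so normalisation reorders
    <METEG, HIRIQ> unless a CGJ (class 0) sits between them.

```python
import unicodedata

LAMED = "\u05DC"  # HEBREW LETTER LAMED (illustrative base letter)
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ, canonical combining class 14
METEG = "\u05BD"  # HEBREW POINT METEG, canonical combining class 22
CGJ = "\u034F"    # COMBINING GRAPHEME JOINER, canonical combining class 0


def cps(s: str) -> str:
    """Return the code points of s as space-separated hex (hypothetical helper)."""
    return " ".join(f"{ord(c):04X}" for c in s)


plain = LAMED + METEG + HIRIQ          # marks in "wrong" canonical order
guarded = LAMED + METEG + CGJ + HIRIQ  # CGJ blocks canonical reordering

plain_nfc = unicodedata.normalize("NFC", plain)
guarded_nfc = unicodedata.normalize("NFC", guarded)

print(cps(plain_nfc))    # 05DC 05B4 05BD  (hiriq and meteg swapped)
print(cps(guarded_nfc))  # 05DC 05BD 034F 05B4  (order preserved)
```

    The same holds for NFD, since the reordering comes from the canonical
    ordering step that both forms share.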

    In particular the graphemic unit, if any, is not tight enough for enclosing
    diacritics to treat it as a unit. <X, CGJ, Y, U+20DD COMBINING ENCLOSING
    CIRCLE> is (in general) X followed by circled Y, not encircled XY. On the
    other hand, a Fraktur vowel with diaeresis remains a default grapheme


