From: Kent Karlsson (firstname.lastname@example.org)
Date: Mon Jan 09 2006 - 12:08:47 CST
> > Would it be wise to typeset this *always* as U+0049 U+200D U+004A?
> > Or would U+0132 be a better choice? (Ditto for lower case.)
Yes. This is one of the reasons for the existence of U+0132 and U+0133.
Another reason is the casing property. If you don't care about either of
these, using just I+J (and i+j) is just fine.
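
To illustrate the casing point, here is a minimal sketch, assuming Python 3 and its built-in string methods. The sample word and the variable names are only illustrative.

```python
# U+0132 LATIN CAPITAL LIGATURE IJ and U+0133 LATIN SMALL LIGATURE IJ
IJ_UPPER = "\u0132"
IJ_LOWER = "\u0133"

# The same illustrative word, once with the single character
# and once with the two separate letters i + j.
with_ligature = IJ_LOWER + "sselmeer"
with_letters = "ijsselmeer"

# Capitalising the single character gives the capital IJ as a unit.
cap_ligature = with_ligature.capitalize()

# With separate letters, only the "i" is capitalised.
cap_letters = with_letters.capitalize()

print(cap_ligature == IJ_UPPER + "sselmeer")
print(cap_letters)
```

With plain i+j, getting "IJ" at the start of a word would need language-specific handling on top of the default case mappings.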
> Theoretically, U+0132 is a compatibility character with U+0049 U+004A
> as the compatibility decomposition.
It has the *standardised* (non-theoretical) decomposition: <compat> 0049 004A.
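
The mapping can be checked directly. This is a small sketch, assuming Python's standard `unicodedata` module, whose data comes from whichever version of the Unicode Character Database ships with the interpreter.

```python
import unicodedata

# Decomposition mappings as recorded in the character database.
decomp_upper = unicodedata.decomposition("\u0132")  # capital IJ
decomp_lower = unicodedata.decomposition("\u0133")  # small ij

# A compatibility mapping is applied only by the K normalisation forms.
nfc = unicodedata.normalize("NFC", "\u0132")    # canonical: unchanged
nfkc = unicodedata.normalize("NFKC", "\u0132")  # compatibility: two letters

print(decomp_upper)
print(decomp_lower)
print([f"U+{ord(c):04X}" for c in nfc])
print([f"U+{ord(c):04X}" for c in nfkc])
```

So the distinction survives NFC and NFD, and is lost only under NFKC and NFKD.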
> Being a compatibility decomposable
> character, it is not recommended except in the representation
No, it does not say that. There are exceptions to that interpretation
of compatibility characters (and compatibility decomposable characters),
the IJ LIGATURE and the LONG S are among them. I think it is perfectly fine
to recommend their use in situations like this (and for long s it is the
way of getting a long s, barring certain experimental and ill-conceived
> of existing
> data in conditions where you need or wish to retain the
> difference between
> the character sequence "IJ" and the IJ ligature at the
> character level.
> Note that although the U+0132 indicates a ligature character, its
> decomposition does not include U+200D (word joiner) or any other
200D is ZERO WIDTH JOINER, 2060 is WORD JOINER. Neither is used in any
decomposition mapping except for themselves.
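
A rough way to check both points is sketched below, assuming Python's standard `unicodedata` module. The helper `mappings_containing` is a hypothetical name introduced here, and the result reflects the Unicode version bundled with the interpreter.

```python
import sys
import unicodedata

# Character names for the two code points in question.
zwj_name = unicodedata.name("\u200D")
wj_name = unicodedata.name("\u2060")

def mappings_containing(targets):
    """Return code points whose decomposition mapping mentions any target.

    `targets` is a set of four-digit hex strings such as {"200D", "2060"}.
    """
    hits = []
    for cp in range(sys.maxunicode + 1):
        # decomposition() returns e.g. "<compat> 0049 004A", or "" if none.
        fields = unicodedata.decomposition(chr(cp)).split()
        if targets.intersection(fields):
            hits.append(cp)
    return hits

hits = mappings_containing({"200D", "2060"})
print(zwj_name, "/", wj_name)
print(hits)
```

The scan covers every code point, so it takes a moment to run.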
> indication of the ligature behavior. I'm not sure why this is so, but
> I can understand it as a consequence of treating U+0132 as a
> ligature of "I" and "J", following an orthographic and typographic
> tradition. Its full typographic meaning could not be
> expressed formally
> anyway, since it's not just a matter of applying _some_
> ligature behavior.
I'm not sure what your point is here.
> The _specific_ ligature behavior might be expressible at
> other protocol
> levels, such as typesetting instructions that map the
> sequence "IJ" to a
> particular ligature glyph or render it in a particular style.
> Then the pragmatics. Using U+200D simply doesn't work here. I
I would NOT expect it to form closer spacing, or the IJ ligature in any
ZWJ could be used to "recommend" the use of a typographic ligature, but
should not (IMO) be used to form *orthographic* ligatures (except in
scripts, where that is specified). The IJ ligature is somewhere between
typographic and orthographic ligatures, but I'd rather err on the safe side
and consider it an orthographic ligature, like Œ and Æ. Unlike purely
typographic ligatures like for fi, fj, fö (see
fk, gj, tt, Th, and similar (which should be provided in better fonts
the "ink" for the separate letters otherwise would overlap in an ugly