RE: IJ joint in spaced lettering

From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Mon Jan 09 2006 - 13:30:02 CST


    On Mon, 9 Jan 2006, Kent Karlsson wrote:

    >> Theoretically, U+0132 is a compatibility character with U+0049 U+004A
    >> as the compatibility decomposition.
    >
    > It has the *standardised* (non-theoretical) decomposition: <compat> 0049
    > 004A.

    The word "Theoretically" meant that I first considered how things are in
    principle, according to the Unicode standard.
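For readers who want to check the decomposition themselves, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `compat_info` is made up for illustration, and the values come from whatever Unicode version the local Python ships with, although this particular mapping has been stable for a long time.

```python
import unicodedata

IJ = "\u0132"  # LATIN CAPITAL LIGATURE IJ

def compat_info(ch):
    """Return (raw decomposition field, NFC form, NFKC form) for one character."""
    return (
        unicodedata.decomposition(ch),      # field from the character database
        unicodedata.normalize("NFC", ch),   # canonical normalization only
        unicodedata.normalize("NFKC", ch),  # also applies compatibility mappings
    )

decomp, nfc, nfkc = compat_info(IJ)
print(decomp)        # the tagged mapping, "<compat> 0049 004A"
print(nfc == IJ)     # True: canonical normalization leaves U+0132 alone
print(nfkc)          # the two separate letters "IJ"
```

The point of the sketch is that the character survives canonical normalization but is replaced by plain I and J under the compatibility forms (NFKC, NFKD), which is what makes its use fragile in practice.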

    >> Being a compatibility decomposable
    >> character, it is not recommended except in the representation
    >
    > No, it does not say that.

    "Compatibility decomposable characters are a subset of compatibility
    characters included in the Unicode Standard to represent distinctions in
    other base standards. They support transmission and processing of legacy
    data. Their use is discouraged other than for legacy data or other special
    circumstances."

       Definition D21 in section 3,
       http://www.unicode.org/versions/Unicode4.0.0/ch03.pdf#G748

    > There are exceptions to that interpretation
    > of compatibility characters (and compatibility decomposable characters),
    > the IJ LIGATURE and the LONG S are among them. I think it is perfectly
    > fine to recommend their use in situations like this

    I think so too; we seem to agree on the practical point. But I discussed
    what the standard says (in a somewhat odd place, but the same general idea
    can be seen elsewhere in the standard, too).

    >> Note that although the U+0132 indicates a ligature character, its
    >> decomposition does not include U+200D (word joiner) or any other
    >
    > 200D is ZERO WIDTH JOINER, 2060 is WORD JOINER. Neither is used in any
    > decomposition mapping except for themselves.

    Right. (I was wondering whether I should mention the difference, and of
    course the wrong name crept into my text. :-( )
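The two names are easy to confuse, so a quick lookup may help. The following is only a sketch based on Python's `unicodedata` module; `code_points_decomposing_to` is a hypothetical helper that scans the database for decomposition mappings mentioning a given code point, and its result depends on the Unicode version bundled with the interpreter.

```python
import sys
import unicodedata

ZWJ = "\u200D"
WJ = "\u2060"

for ch in (ZWJ, WJ):
    # Print the code point, its official name, and its own decomposition field.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} "
          f"decomposition={unicodedata.decomposition(ch)!r}")

def code_points_decomposing_to(target):
    """List code points whose decomposition mapping contains `target`."""
    wanted = f"{ord(target):04X}"
    hits = []
    for cp in range(sys.maxunicode + 1):
        fields = unicodedata.decomposition(chr(cp)).split()
        # Skip the optional tag such as "<compat>" and compare hex fields.
        if wanted in (f for f in fields if not f.startswith("<")):
            hits.append(cp)
    return hits

zwj_users = code_points_decomposing_to(ZWJ)
wj_users = code_points_decomposing_to(WJ)
print(len(zwj_users), len(wj_users))
```

If Kent's observation holds for the installed Unicode data, both lists should come out empty.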

    > ZWJ could be used to "recommend" the use of a typographic ligature, but
    > should not (IMO) be used to form *orthographic* ligatures

    Such a distinction does not exist in the Unicode standard, and as you
    mention, the IJ ligature would be a borderline case anyway.

    Especially considering the classification of the ij ligature as a letter
    in Dutch, we might say that it should really have been defined as a
    primary (non-compatibility) character, much the same way as the oe
    ligature and the ae ligature (which is now even called "letter ae",
    not a ligature, though it's still effectively used as a ligature, too).
    But it's too late to change that now. (Maybe some official statement,
    constituting an explicit exception to the principle of avoiding
    compatibility decomposable characters, would be in order.)
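The asymmetry between these three characters can be seen directly in the character database. This is a small sketch with Python's `unicodedata`; the helper `is_compat_decomposable` is an illustrative name, and it simply checks for a tagged (compatibility) decomposition mapping.

```python
import unicodedata

def is_compat_decomposable(ch):
    """True if the character has a tagged (compatibility) decomposition."""
    return unicodedata.decomposition(ch).startswith("<")

samples = ["\u00C6", "\u0152", "\u0132"]  # AE, OE, IJ (capital forms)

report = {
    unicodedata.name(ch): is_compat_decomposable(ch)
    for ch in samples
}

for name, compat in report.items():
    print(f"{name}: compatibility decomposable = {compat}")
```

In the data that Python ships, AE and OE carry no decomposition at all, while IJ carries a `<compat>` mapping, so only IJ is affected by compatibility normalization.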

    -- 
    Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
    

