RE: FAQ entry (was: Looking for information on the UnicodeData file)

From: Kent Karlsson
Date: Fri Mar 07 2003 - 07:51:07 EST

    The names do NOT always describe the characters correctly. This is
    especially true of names containing "digraph" or "ligature" (and of
    the name of U+00E6 too), as well as, e.g., SCRIPT CAPITAL P, which
    is neither script nor capital (it's lowercase), though it is a p...
    In addition, there are different flavours of ligature. It is quite
    legitimate to render LIGATURE FI, for example, as an f followed by
    an i with no ligation, whereas that is not allowed for the ae
    ligature/letter, nor for the oe ligature.

            /kent k

    From: Pim Blokland

    > John Cowan wrote:
    > > Digraphs and ligatures are both made by combining two glyphs. In a
    > > digraph, the glyphs remain separate but are placed close together.
    > > In a ligature, the glyphs are fused into a single glyph.
    > Oh, in that case I must say I think the UnicodeData.txt file doesn't
    > do a very good job.
    > For instance, the Danish ae (U+00E6) is not designated a ligature,
    > but the Dutch ij (U+0133) is, even though the "a" and "e" are clearly
    > fused together, while the "i" and "j" aren't.
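
The naming mismatch discussed in this thread can be checked directly against the character database. The following is a minimal sketch, assuming Python's standard `unicodedata` module (whose data version depends on the Python build); the helper names `describe` and `has_compat_fallback` are hypothetical. It treats the presence of a compatibility decomposition as a rough proxy for "may be replaced by separate letters", which is an assumption about the data, not a rendering rule taken from the thread.

```python
import unicodedata

# Characters mentioned in the thread: ae, ij, the fi ligature, and script p.
CHARS = ["\u00e6", "\u0133", "\ufb01", "\u2118"]


def describe(ch):
    """Return (code point, character name, decomposition field) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.decomposition(ch),
    )


def has_compat_fallback(ch):
    """True if NFKD normalization replaces ch with other characters.

    This is only a proxy: it reports what the database says, not what a
    renderer is allowed to do.
    """
    return unicodedata.normalize("NFKD", ch) != ch


for ch in CHARS:
    # One line per character: code point | name | decomposition | fallback flag
    print(*describe(ch), has_compat_fallback(ch), sep=" | ")
```

Under these assumptions, U+00E6 is named a LETTER and has no decomposition, while U+0133 and U+FB01 are both named LIGATURE and both decompose to two plain letters, so the word "ligature" in a name says little about whether the glyphs are visually fused.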
