Ligatures with diacritics (was: Ancient Northwest Semitic Script)

From: Peter Kirk (peterkirk@qaya.org)
Date: Tue Dec 30 2003 - 16:13:16 EST

    On 30/12/2003 11:44, John Hudson wrote:

    > At 11:15 AM 12/30/2003, Peter Kirk wrote:
    >
    >>> Even if it were verified, it isn't a good case for encoding a
    >>> separate character *equivalent* to a combination of two existing
    >>> characters: that's a glyph variant ligature.
    >>
    >>
    >> Actually, I don't think so. The separate character was not formed by
    >> merging the dot into the letter, rather the distinction was made in a
    >> different way.
    >
    >
    > In modern digital font development, ligation refers to the mechanism
    > of display, not the visual appearance, which is largely irrelevant. A
    > ligature is any glyph that represents two or more characters,
    > typically arrived at by a ligation lookup. If I wanted a special sin
    > glyph *equivalent* to the character sequence <shin, sindot>, I would
    > ligate the two characters to that single glyph, either directly
    >
    > shin sindot -> sin
    >
    > or via a two-stage stylistic variant lookup associated with a
    > different typographic feature
    >
    > shin sindot -> shin_sindot
    > and then
    > shin_sindot -> sin
    >
    >
    I understand this, and, as I answered separately, I don't think this is
    the appropriate mechanism in this case, as the suggested ligature is not
    fully equivalent to the sequence.
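
    As a rough illustration of the two routes John describes, here is a
    minimal Python sketch of a glyph-substitution pass. It is not OpenType
    code: the table layout and the function names apply_lookup and shape
    are assumptions made for illustration, and only the glyph names come
    from the quoted lookups.

```python
# Hypothetical lookup tables: each maps a glyph-name sequence to one glyph.
DIRECT = [
    {("shin", "sindot"): "sin"},
]
TWO_STAGE = [
    {("shin", "sindot"): "shin_sindot"},  # stage 1: ligation
    {("shin_sindot",): "sin"},            # stage 2: stylistic variant
]


def apply_lookup(glyphs, lookup):
    """Apply one substitution lookup left to right over a glyph list."""
    out, i = [], 0
    while i < len(glyphs):
        for seq, replacement in lookup.items():
            if tuple(glyphs[i:i + len(seq)]) == seq:
                out.append(replacement)
                i += len(seq)
                break
        else:
            # No entry matched at this position: keep the glyph unchanged.
            out.append(glyphs[i])
            i += 1
    return out


def shape(glyphs, lookups):
    """Run a list of lookups in order, feeding each result to the next."""
    for lookup in lookups:
        glyphs = apply_lookup(glyphs, lookup)
    return glyphs
```

    Under these assumptions, shape(["shin", "sindot"], DIRECT) and
    shape(["shin", "sindot"], TWO_STAGE) both end with the single glyph
    "sin"; the two-stage route differs only in that the second step could
    be tied to a separate feature and switched off.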

    But if it were, this ligature would be very interesting and problematic
    because it is a ligature between a base character and a diacritic. This
    is not a problem if it is always used, in a particular font, but it is
    problematic if the ligature is optional. This is because ZWNJ and ZWJ
    cannot be used between base characters and diacritics because they break
    the combining sequence. We came across this problem before with Hebrew
    script, but in a rather different (and less ambiguous) context, that of
    the need for a ligature between meteg and hataf vowels.
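
    To make the ZWNJ problem concrete, here is a small Python sketch. It
    only inspects character sequences with the standard unicodedata module;
    the helper names ligature_matches and base_of_marks are hypothetical,
    and the string test stands in for a real font lookup.

```python
import unicodedata

SHIN = "\u05E9"     # HEBREW LETTER SHIN
SIN_DOT = "\u05C2"  # HEBREW POINT SIN DOT (a combining mark)
ZWNJ = "\u200C"     # ZERO WIDTH NON-JOINER


def ligature_matches(text):
    """Stand-in for a font lookup: does <shin, sindot> occur contiguously?"""
    return SHIN + SIN_DOT in text


def base_of_marks(text):
    """For each combining mark, name the character immediately before it."""
    return [
        unicodedata.name(text[i - 1]) if i > 0 else None
        for i, ch in enumerate(text)
        if unicodedata.combining(ch) != 0
    ]


plain = SHIN + SIN_DOT
blocked = SHIN + ZWNJ + SIN_DOT

print(ligature_matches(plain), base_of_marks(plain))
print(ligature_matches(blocked), base_of_marks(blocked))
```

    With ZWNJ inserted, the ligature no longer matches, but the sin dot is
    now preceded by the ZWNJ rather than by shin, which is the broken
    combining sequence described above.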

    I wonder if there are other, better defined, cases of ligatures between
    base characters and diacritics in other scripts, i.e. cases where there
    is an optional alternative to base character plus diacritic which does
    not look like the base character plus the diacritic. Candidates like ø
    as an alternative for ö are ruled out because they are already
    separately encoded. I have certainly seen glyphs rather like U+0255 used
    for c cedilla. In the light of recent discussions, I can easily imagine
    a script or style like Sütterlin having a special ligated form for u
    umlaut. That ligature would have to be avoided in the name Saül, in
    which the dots represent a diaeresis rather than an umlaut; there, two
    dots should be written above the letter as in normal Latin script.
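
    The reason the font cannot tell these two cases apart by itself can be
    shown with a short Python sketch using the standard unicodedata module.
    The German word "für" is my own example of an umlaut, not one from the
    discussion, and the helper name u_with_dots is hypothetical.

```python
import unicodedata

DIAERESIS = "\u0308"  # COMBINING DIAERESIS, used for umlaut and diaeresis alike


def u_with_dots(word):
    """Return the decomposed <u, U+0308> sequence if the word contains one."""
    decomposed = unicodedata.normalize("NFD", word)
    pair = "u" + DIAERESIS
    return pair if pair in decomposed else None


saul = "Sa\u00FCl"  # the dots mark a diaeresis
fuer = "f\u00FCr"   # the dots mark an umlaut (illustrative assumption)

# Both words yield the identical character sequence for the dotted u.
print(u_with_dots(saul) == u_with_dots(fuer))
```

    Since the character data is identical in both words, any choice between
    the ligated and unligated forms would have to be signalled by something
    other than the letter and its diacritic.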

    OpenType and similar fonts are currently able to make these distinctions
    consistently, with the mechanisms John described above; but these
    mechanisms fail if the ligature needs to be optional, as ZWNJ and ZWJ
    cannot be used between a base character and a diacritic.

    Are there any real examples where this might be necessary?

    As this is a more general issue, I am copying it back to the main Unicode
    list.

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    

