Chairless/Amphibious hamza (was: missing chars for Arabic (sequential tanween))

From: John Hudson (
Date: Thu Dec 20 2007 - 22:42:50 CST

    arno wrote:

    > > Chairless Hamza
    > > ... is a non-disjoining character.
    > > This means, when it comes in between two joinable characters,
    > > it doesn't separate them. An example for the behavior of Quranic
    > > Hamza, is the word /a'aadam/ in Q2:31, 33, 34.

    > I do not see the need for this character.
    > Why not use U+0621 ARABIC LETTER HAMZA for a leading hamza (as in
    > /a'aadam/ in Q2:31) just as one uses it for a trailing one (as in
    > /sawāʾun/ in Q2:6)?
    > Why not use U+0654 Arabic Hamza Above for a non disjoining hamza between
    > two joined characters?

    As I understand it, the positioning of the chairless hamza and of U+0654 is not the same.
    The chairless hamza, though above the joined letters, is positioned near to where they
    join, while U+0654 entered in sequence between the letters will be placed on the first
    letter (following the basic Unicode rule that marks follow the letter to which they are
    applied).

    > Why not use U+0674 ARABIC LETTER HIGH HAMZA for the hamza BEFORE the
    > second stroke of lam-alif?

    U+0674 is needed to form digraphs for Kazakh, which means that it doesn't really behave
    like a combining mark. Indeed, depending on the typeface design, it may be a spacing
    character. This may result in breaking of lam_alif ligation if U+0674 is inserted between
    them; this could be resolved with contextual glyph handling if necessary, but may cause
    issues for some existing fonts.
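
    The difference between the three code points discussed above can be checked against the
    Unicode Character Database. What follows is a minimal sketch using Python's standard
    unicodedata module; the helper name describe is an assumption made for illustration and
    is not part of any proposal in this thread. It shows that U+0654 is a nonspacing
    combining mark, while U+0621 and U+0674 are classified as letters.

```python
import unicodedata

# The three hamza-related code points named in this thread.
CANDIDATES = ["\u0621", "\u0654", "\u0674"]


def describe(ch):
    """Return (code point, name, general category, combining class) for ch.

    Hypothetical helper for illustration only.
    """
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),   # 'Mn' = nonspacing mark, 'Lo' = other letter
        unicodedata.combining(ch),  # 0 for characters that are not combining marks
    )


for ch in CANDIDATES:
    print(describe(ch))
```

    Because only U+0654 has a combining category, a layout engine attaches it to the
    preceding base letter; the other two take part in the text as letters in their own
    right.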

    I am also not very happy with the idea of using different codepoints for the same letter
    based on its positioning and behaviour relative to other letters: this undermines the
    Unicode character glyph model, and I believe positioning is a display issue.

    The chairless hamza -- what Tom Milo calls the 'amphibious hamza' because sometimes it is
    floating and sometimes it is sitting, depending on context -- is a topic I have given a
    lot of thought recently. It is problematic because the modern use of the letter as a
    disjoining character, based on several generations of typesetting technology, has bred an
    expectation among modern readers, who find it normal that this letter is disjoining. This
    sort of thing happens all too easily when the dominant technology for typesetting a script
    introduces novel behaviours for reasons of technical limitations.

    So I'm thinking about the issue in terms of being able to satisfy different user
    communities: those who want the traditional amphibious hamza and those who expect the
    disjoining hamza. Returning to my point above: I think this can be looked at as a display
    issue, in which case the question becomes whether font formats and layout engines have
    suitable mechanisms to handle the contextual behaviour. OpenType does, and I believe
    Apple's AAT and SIL's Graphite do also. Tom Milo's ACE technology certainly does, as he
    has already implemented this.
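
    To make the idea of contextual handling concrete, here is a toy sketch in Python. It is
    not OpenType, AAT, Graphite or ACE code, and it does not describe how any of those
    implement the behaviour. The joining table is hand-written and covers only a few
    letters, and the names JOINING, hamza_form and the traditional flag are assumptions made
    for illustration. The rule it encodes is the one quoted at the top of this message: the
    hamza floats when it falls between two letters that would otherwise join, and sits on
    the baseline in every other context.

```python
# Hypothetical, hand-written joining data for a few letters only.
# "D" = dual-joining; "R" = right-joining (joins only to the preceding letter).
JOINING = {
    "\u0628": "D",  # BEH
    "\u0644": "D",  # LAM
    "\u0645": "D",  # MEEM
    "\u0627": "R",  # ALEF
    "\u062F": "R",  # DAL
}
HAMZA = "\u0621"


def hamza_form(text, i, traditional=True):
    """Pick 'floating' or 'sitting' for the hamza at index i of text.

    traditional=False models readers who expect the modern disjoining
    hamza: the letter always sits and never floats.
    """
    if text[i] != HAMZA:
        raise ValueError("no hamza at that index")
    if not traditional:
        return "sitting"
    prev = text[i - 1] if i > 0 else ""
    nxt = text[i + 1] if i + 1 < len(text) else ""
    joins_forward = JOINING.get(prev) == "D"          # previous letter can join leftwards
    joins_backward = JOINING.get(nxt) in ("D", "R")   # next letter can join rightwards
    return "floating" if joins_forward and joins_backward else "sitting"


print(hamza_form("\u0628\u0621\u0645", 1))                     # between BEH and MEEM
print(hamza_form("\u0627\u0621\u0645", 1))                     # after ALEF
print(hamza_form("\u0628\u0621\u0645", 1, traditional=False))  # modern preference
```

    The same character sequence yields either result depending on one switch, which is the
    sense in which the choice can be treated as a display matter rather than an encoding one.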

    In an ideal world, the amphibious hamza would never have developed a modern usage as a
    disjoining character, and Unicode would encode it appropriately. As it is, Unicode has
    inherited a typesetting model that is at odds with the script tradition in a number of
    ways, and we need to figure out ways around these issues (especially since Unicode is
    circumscribed by stability agreements that prevent change in many areas).

    Generally speaking, when one is faced with different user communities, with different
    expectations and practices, or even with individuals who have different preferences
    depending on the kind of text they are working with and the style of typeface design they
    are using, one has to start looking for solutions in the framework of glyph processing
    technologies, font formats and layout engines, rather than at the character encoding level.

    I have, by the way, an illustration of the amphibious hamza issue online as a result of a
    previous discussion:

    John Hudson

    Tiro Typeworks
    Gulf Islands, BC
    At the sunset of our days on earth, at the moment of
    death, we will be evaluated on the basis of our similarity
    or otherwise with the Baby who is to be born in the poor
    grotto of Bethlehem, since it is He who is the standard
    of measurement which God has given to humanity.
                        -- Benedict XVI
