Re: Chairless/Amphibious hamza

From: John Hudson
Date: Fri Dec 21 2007 - 04:54:43 CST

    arno wrote:

    > Not quite. It is a floating mark that lengthens the connection between
    > the letter before and the letter after -- or in other words: it is a
    > mark above (or below) tatwîl.
    > I agree with you, that storing "amphib Hamza" instead of "tatweel +
    > hamza above" OR "tatweel + hamza below" is more elegant, but much more
    > difficult to interpret.

    Difficult to interpret in what sense?

    If one considers the elongation of letters to be a display issue and not an encoding
    issue, as I definitely do, then there is no question of how this should be done. I don't
    even think the tatweel character should exist, let alone be recommended to be used for
    anything. I believe it exists because old metal fonts had a bit of metal with a connecting
    line on it that could be used to crudely elongate the baseline stroke in very horizontal
    typefaces. I don't believe digital text encoding should include such things.
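
    As a minimal sketch of what is at stake at the encoding level, the snippet below lists the
    code point sequences under discussion, using only Python's standard unicodedata module.
    The labels in CANDIDATES are illustrative names for the options raised in this thread, not
    terms from the Unicode Standard.

```python
import unicodedata

# Candidate encodings discussed in this thread, as code point sequences.
# The dictionary keys are informal labels chosen for this illustration.
CANDIDATES = {
    "tatweel + hamza above": "\u0640\u0654",
    "tatweel + hamza below": "\u0640\u0655",
    "plain hamza (shaped at display level)": "\u0621",
}

def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

for label, sequence in CANDIDATES.items():
    print(label, "->", describe(sequence))
```

    The first two options differ from each other in the stored text; the third stores a single
    character and leaves elongation and vertical position to the font and shaping engine.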

    >> Since the distinction is one of shaping and positioning, determined by
    >> the shaping behaviour of adjacent letters, I believe that this is
    >> properly addressed as a display issue and not as an encoding issue.

    > You state this as a fact.
    > But I wrote earlier, that I disagree.
    > Only when you have a fully vowelled text and a clearly defined locale,
    > adjacent chars will determine the position of hamza sufficiently.

    You need more than a locale, in the commonly understood meaning of that term. You need
    something capable of making distinctions at the level of individual editions of the
    Qur'an. That is a typographical level, and you need to be able to plug in to font
    architecture at a level that enables you to make a distinction between different ways of
    displaying identically encoded text. That level is something like the OpenType 'language
    system' tag (which is poorly named, since what it signifies is particular typographic
    conventions, which could be associated with a particular language, a particular community,
    a particular country or, yes, a particular publisher or edition).

    I think we need to make a distinction between orthographic conventions and typographic
    conventions. To me, orthographic convention means a distinctive spelling, i.e. the use of
    different characters. Typographic convention means a distinctive appearance, i.e. the use
    of different glyphs. Keeping this distinction clear makes it easier to analyse what is
    happening in text, and one of the first things to do when looking at differences between
    e.g. different Qur'ans is to determine whether individual differences are orthographic or
    typographic. This isn't necessarily an easy determination to make, and I suspect that what
    we're disagreeing about regarding the hamza is essentially this question.
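
    The distinction can be stated as a simple test, sketched here under stated assumptions:
    compare the characters first, and only if they are identical compare the glyphs. The
    function name classify_difference and the glyph names in the usage note are hypothetical,
    invented for illustration; they do not come from any real font.

```python
def classify_difference(chars_a, glyphs_a, chars_b, glyphs_b):
    """Classify how two renderings of a word differ.

    chars_*  : the encoded text (a string of characters)
    glyphs_* : the glyph names used to display it (a list of strings)
    """
    if chars_a != chars_b:
        # Different characters: a difference in spelling.
        return "orthographic"
    if glyphs_a != glyphs_b:
        # Same characters, different glyphs: a difference in appearance.
        return "typographic"
    return "none"
```

    For example, two editions that both store U+0644 U+0621 U+0627 but display it with the
    hypothetical glyph lists ["lam.isol", "hamza.isol", "alef.isol"] and ["lam.init",
    "hamza.float", "alef.fina"] would be classified as a typographic difference.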

    >> That is, I do not believe a character-level distinction exists or
    >> should exist between the hamza between two joining letters and the
    >> hamza between two non-joining letters. The distinction is in the display.

    > I do not believe that a character-level distinction exists or should
    > exist between "fa with a dot above" and "fa with a dot below" or between
    > "kaf with three dots above" and "keheh with three dots above." Locale
    > should handle the proper choice of glyph.

    An interesting parallel.

    >> Those different rules simply mean that we can't expect one font to
    >> satisfy all users, but there is nothing unusual in that.

    > No, I want *one* font for writing an Ottoman and Q24 mushaf,
    > one that allows me to write words according to Egyptian AND according to
    > Syrian rules

    Are those rules orthographic or typographic (see above)?

    If they are orthographic, i.e. you are making distinctions in the spelling of the text,
    then making a single font to address them all is possible presuming that all the rules
    follow the grammar of the script.

    But if the distinctions are typographic, i.e. if they require different glyphs to display
    the same text, then it is more problematic. One has to split the glyph display along
    'locale' lines someplace: either at the font level, by making separate fonts for each
    'locale', or at the layout feature level, by defining locale-specific shaping associated
    with appropriate tags.
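
    A toy model of the second option, locale-specific shaping inside one font, might look like
    the sketch below. It is not how any real shaping engine or font format stores its data.
    The tag "EDA " and the glyph names hamza.isol and hamza.float are hypothetical, chosen
    only to show identical text being displayed differently depending on a tag.

```python
# Default character-to-glyph mapping of a hypothetical font.
DEFAULT_GLYPHS = {"\u0621": "hamza.isol"}

# Per-tag overrides, standing in for language-system-specific lookups.
# "EDA " is an invented tag for one particular edition's conventions.
TAG_OVERRIDES = {
    "EDA ": {"\u0621": "hamza.float"},
}

def glyphs_for(text, tag=None):
    """Map each character to a glyph name, applying overrides for the tag."""
    table = {**DEFAULT_GLYPHS, **TAG_OVERRIDES.get(tag, {})}
    # Characters without an entry fall back to a uniXXXX-style name.
    return [table.get(ch, f"uni{ord(ch):04X}") for ch in text]

print(glyphs_for("\u0644\u0621\u0627"))          # default conventions
print(glyphs_for("\u0644\u0621\u0627", "EDA "))  # edition-specific conventions
```

    The first option, separate fonts per 'locale', corresponds to shipping a different
    DEFAULT_GLYPHS table in each font instead of keeping overrides in one.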

    > and most users want a font capable of writing the proper
    > hamza in a basically unvowelled context.

    It is this notion of 'proper hamza', in terms of what most users want, that I find a bit
    discomforting. I look at the very large number of fonts, including new designs, in which
    U+0621 always interrupts the joining of adjacent letters, and if this is wrong I'm
    concerned that I don't hear much complaint about it.
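
    The interruption follows from the joining type assigned to U+0621. The sketch below is a
    deliberately simplified model of cursive joining: it hard-codes the joining types of four
    letters (as given in the Unicode ArabicShaping data) and ignores transparent marks and
    join-causing characters. The function name joins_to_next is hypothetical.

```python
# Joining types for a handful of letters only.
JOINING_TYPE = {
    "\u0621": "U",  # HAMZA: non-joining
    "\u0627": "R",  # ALEF: right-joining (joins to the preceding letter only)
    "\u0628": "D",  # BEH: dual-joining
    "\u0644": "D",  # LAM: dual-joining
}

def joins_to_next(text):
    """For each adjacent pair, report whether the first letter connects to the second.

    A connection needs the first letter to be dual-joining and the second
    to accept a join from the preceding letter (type D or R).
    """
    return [
        JOINING_TYPE[a] == "D" and JOINING_TYPE[b] in ("D", "R")
        for a, b in zip(text, text[1:])
    ]

print(joins_to_next("\u0644\u0628\u0627"))  # lam, beh, alef
print(joins_to_next("\u0644\u0621\u0627"))  # lam, hamza, alef
```

    With beh in the middle both pairs connect; with U+0621 in the middle neither does, so any
    joined appearance has to be produced contextually in the font.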

    [Skipping some stuff that I need to think about more.]

    > I have not seen sample sentences, but just sample words, please instruct
    > me where I can find more.

    In the image, below the larger illustrations of the individual words, are the sample
    sentences in which I found the words, with the words highlighted in red.

    I just did a Google search to locate the sources for sentences containing these words:

    سنؤات انثى وارد من الخارج مواصفات ممتازة سبق لءا الولادة مرة واحدة انجبت 6 كلاب

    وأسلم لءب العالمين تعني أن يعطي ويسلِّم كله لرب العالمين وليس أمره فقط كما هو مطلوب في مستوى
    التفويض. ولتوضيح أنه توجه لطلب الرعاية قال تعالى ( قال أسلمت لرب العالمين ) ولم يقل أسلمت
    لله مثلا، وإنما توجه لصفة الربوبية التي ترعى وتتولى.

    If you tell me that these are both misspellings, I'll give up :)

    > I am afraid that the script traditions are far too complex for you,
    > Thomas and me to grasp.

    If we can't grasp the thing that we're trying to implement, we can't expect to be able to
    implement it. So we must either give up and accept a fundamental rupture between Arabic
    typography and the script traditions, or keep working at the problem, involve more people,
    discuss with manuscript experts and scribes, etc.

    > Not true! The Princely Printing House that produced the King Fuad
    > edition had many more ligatures (jîm/ha/xa ligatures among them), but
    > they freely chose not to use them.

    Do you know why?

    >> I don't *like* having to handle the joining behaviour of letters
    >> adjacent to hamza contextually in font lookups. Ideally I shouldn't
    >> have to. But changing the properties of U+0621 so that adjacent
    >> letters are made joining by compliant shaping engines would break a
    >> lot of software and pretty much all current fonts. I can't see Unicode
    >> doing that, so I think we're obliged to look for solutions at the
    >> display level.

    > Does that mean that you do not think that ADDING "amphib/chairless
    > hamza" is a good idea?

    I suspect, so far as I understand the matter, that it might be a 'non-starter', i.e.
    something that won't appeal to the Unicode Technical Committee members, but I could be
    wrong. I may be missing something in the description of the proposed character. How,
    exactly, does it differ from U+0621?

    John Hudson

    (who may not respond until after Christmas, and really must go and sleep now)

    Tiro Typeworks
    Gulf Islands, BC
    At the sunset of our days on earth, at the moment of
    death, we will be evaluated on the basis of our similarity
    or otherwise with the Baby who is to be born in the poor
    grotto of Bethlehem, since it is He who is the standard
    of measurement which God has given to humanity.
                        -- Benedict XVI