Re: kurdish sorani

From: Behnam (
Date: Tue Aug 29 2006 - 20:56:45 CDT

    I don't want to hijack this thread for a non-Kurdish issue, but since
    it might also benefit the Kurdish case, I will simply say that
    Unicode didn't resolve the issue of U+0647. It did, however,
    recognize that this letter has five forms and not four. The fifth
    form has no contextual behavior whatsoever, and therefore it requires
    a separate code. But since this exceptional case doesn't fit the
    general pattern of 'one code for one letter > related contextual
    shapes', Unicode did not honor the exception by assigning a separate
    code to a form of the same letter.
    But it did apparently recognize that two isolated forms can't live
    together under one single code, and so it scratched one of them... the
    good one.
    And the reasoning behind this choice seems pretty lame to me. Nobody
    chooses small alpha instead of Greek capital alpha (or whatever) to
    avoid similarity with English A.
    U+0647 IS wrongly represented, because that shape doesn't join to
    anything. How can it be representative of the dual-joining Arabic
    letter heh?
    To understand this, you must understand that the shape of the initial
    form of Arabic heh has nothing to do with the shape of the two-eyed
    isolated form. The visual similarity is superficial and the
    functionality is completely different.
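
For readers who want to see which positional forms of heh are actually encoded, here is a minimal sketch using Python's standard `unicodedata` module. It only inspects the compatibility presentation forms (Arabic Presentation Forms-B) that decompose to U+0647; it says nothing about how a font or shaping engine should draw the letter. The helper name `compatibility_forms` is made up for this illustration.

```python
import unicodedata

HEH = "\u0647"  # ARABIC LETTER HEH


def compatibility_forms(base: str) -> dict[str, str]:
    """Map each positional tag (isolated, final, initial, medial) to the
    Arabic Presentation Forms-B character that decomposes to `base`."""
    forms = {}
    target = f"{ord(base):04X}"
    # Arabic Presentation Forms-B occupies U+FE70..U+FEFF.
    for cp in range(0xFE70, 0xFF00):
        decomp = unicodedata.decomposition(chr(cp)).split()
        # A single-letter positional form decomposes as e.g. "<initial> 0647";
        # ligatures and unassigned code points are skipped by the length check.
        if len(decomp) == 2 and decomp[1] == target:
            forms[decomp[0].strip("<>")] = chr(cp)
    return forms


if __name__ == "__main__":
    for tag, ch in compatibility_forms(HEH).items():
        print(f"{tag:9} U+{ord(ch):04X} {unicodedata.name(ch)}")
```

For U+0647 this lists four compatibility forms, U+FEE9 through U+FEEC (isolated, final, initial, medial). No fifth, non-joining form appears among them, which is the gap described above.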


    On 29-Aug-06, at 8:57 PM, Kenneth Whistler wrote:

    > Not being an Arabic script expert, I cannot comment
    > meaningfully on the details of Kurdish shaping or the
    > other claims in this thread, but...
    >> This is not a Persian letter issue. It's an Arabic letter U+0647
    >> issue for Arabic, old Turkish, Persian... and now perhaps Kurdish,
    >> and there may be more.
    >> What is called the two-eyed initial form is only used as an initial
    >> form and doesn't need a control character.
    >> What is produced by control character is only because Unicode doesn't
    >> allow any other option but the real intended shape,
    > That claim seems to me to be incorrect. The Unicode Standard
    > provides information about Arabic shaping, but there is
    > certainly nothing in the standard which "doesn't allow any
    > other option" -- including doing the "right thing" when shaping
    > for Kurdish or some other language using the Arabic script.
    > The encoded presentation forms for Arabic and for Urdu are
    > simply compatibility forms, and should certainly not be
    > taken as constraining how one should shape the actual
    > U+06XX Arabic letters in appropriate contexts. And the
    > joining groups displayed in Tables 8-7 and 8-8 of the
    > standard should *guide* basic Arabic implementations, but
    > again should not be taken as tying anyone's hands from doing
    > proper shaping for various styles or languages using the script.
    >> 'abbreviated
    >> form', which BTW is wrongly presented as U+0647 in Unicode PDF, is
    >> never joined from the left or right.
    > The glyph used in U+0647 was chosen deliberately as of Unicode 2.0,
    > when production constraints no longer allowed the use of more
    > than one representative glyph per character in the chart. Since
    > Unicode 2.0, this choice has always been explained in the text of
    > the standard. See TUS 4.0, p. 204. It is not wrongly presented -- it
    > is merely *a* choice of *a* glyph for HEH, attempting to
    > visually distinguish it from other related letters and U+0665
    > ARABIC-INDIC DIGIT FIVE in the chart.
    > --Ken
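
As a small aside on the chart-glyph point quoted above: whatever the two chart glyphs look like, HEH and the digit five are distinct characters with different properties. A minimal sketch, again with Python's standard `unicodedata` module (the helper name `describe` is made up for this illustration):

```python
import unicodedata


def describe(ch: str) -> tuple[str, str]:
    """Return the Unicode character name and general category of `ch`."""
    return unicodedata.name(ch), unicodedata.category(ch)


if __name__ == "__main__":
    heh = "\u0647"   # the letter under discussion
    five = "\u0665"  # the digit the chart glyph is meant to stay distinct from
    print(describe(heh))
    print(describe(five))
    print(unicodedata.digit(five))
```

The letter reports category `Lo` (other letter) and the digit reports `Nd` (decimal number), so the two are never confused in the data, only potentially by eye in the chart.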
