Re: Arabic Script: A new Hamza is required for Urdu and Sindhi

From: Gregg Reynolds (unicode@arabink.com)
Date: Sun Sep 18 2005 - 08:17:45 CDT

    Hi Lateef,

    Sorry for not responding directly earlier. My initial response is that
    you've got it exactly right, and I would encourage you to proceed. It
    looks to me like you have identified a legitimate need, and your
    proposed solution is consistent with Unicode principles. But I have
    some questions:

    Lateef Sagar wrote:
    > Hi,
    > Gentlemen, my question was about including a new Hamza for Sindhi and
    > Urdu. The issue with teh marboota might be solved easily if a medial
    > shape is introduced. But for my initial question, it may not, because
    > all the Hamza's in Unicode are used in Arabic and introducing a separate
    > shape for existing Hamza's will definitely cause issues.
    > The Hamza that is taught to children in schools, for Sindhi and Urdu is
    > just one, with the shapes that I mentioned in my earlier email. If you
    > like I can post a scanned picture of text books for clarification. All
    > other characters, required for Sindhi and Urdu are very well supported
    > by Unicode.

    I think it would be useful if you could post some images on a website.

    > Local printing industry is using Unicode based software and
    > web development houses are making Unicode based Sindhi and Urdu web
    > sites. But as I mentioned in my early email, sorting and searching is
    > now creating problem with words with Hamza, and particularly verbs and
    > their forms. In Arabic Grammar there might be a rule for replacing teh
    > marboota with teh, when required in initial or medial forms, but since
    > there is only one Hamza in Urdu and Sindhi and there is no Hamza above
    > yeh in these languages, therefore no such rule exist in Sindhi and Urdu
    > Grammar that says to change the isolated hamza with hamza above yah
    > when initial or medial forms are required.
    > I want to know your opinion so that I can push Urdu Language Authority,
    > Sindhi Language Authority and CRULP to submit a detailed proposal for
    > new hamza.

    Some terminology you might find useful: in Arabic, there is only one
    Hamza character. The various Unicode "characters" that include the
    hamza *form* are not characters in Arabic; they are combinations of a
    "seat" and a hamza. The seat has no semantics; it is purely graphical,
    although choice of which form to use as seat is governed by various
    orthographic/morphologic principles. In particular, it has no effect on
    lexical ordering. So in your "Hamza Issue.pdf" it might be clearer if
    you say e.g. "the shape of the seat of the hamza" instead of "the shape
    of the hamza". In Arabic, at least, the shape of the hamza itself never
    changes.
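
The seat-plus-hamza view can be seen in Unicode's own character data for the
precomposed letters. Below is a minimal sketch, assuming Python's standard
unicodedata module; the helper name seat_and_hamza is made up for
illustration and is not part of any proposal.

```python
import unicodedata

def seat_and_hamza(ch):
    """Return (seat name, hamza mark name) for a precomposed hamza letter,
    or None if the character has no two-part canonical decomposition."""
    parts = unicodedata.decomposition(ch).split()
    # Skip characters with no decomposition and compatibility mappings,
    # which start with a tag such as "<isolated>".
    if len(parts) != 2 or parts[0].startswith("<"):
        return None
    seat, mark = (chr(int(p, 16)) for p in parts)
    return unicodedata.name(seat), unicodedata.name(mark)

# U+0623, U+0624, U+0626 are seated hamzas; U+0621 is the bare hamza.
for ch in "\u0623\u0624\u0626\u0621":
    print(f"U+{ord(ch):04X}", seat_and_hamza(ch))
```

The bare hamza U+0621 has no decomposition, so the helper returns None for
it, while each seated form splits into a seat letter plus a combining hamza
mark.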

    But in Arabic there is no reliable rule (or at least no relatively
    simple rule) that can be used to determine which seat to use.
    Furthermore, in different grammatical forms of the same word, hamza may
    take different seats. For example, a final hamza-on-alif may become
    hamza-on-ya or hamza-on-waw if a suffix is added to the word. But the
    semantic element is the hamza; even if the seat changes, the hamza is
    used for lexical lookup and sorting.
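
To illustrate what "the hamza is used for lexical lookup and sorting" could
mean in software, here is a rough sketch that folds every seated hamza to
the bare hamza before comparing. The folding table and the name hamza_key
are assumptions made for this example, and the result is only a
preprocessing step, not a full collation algorithm.

```python
HAMZA = "\u0621"  # ARABIC LETTER HAMZA (no seat)

# Hypothetical folding table: each seated hamza letter maps to bare hamza.
SEATED_HAMZA = {
    "\u0623": HAMZA,  # alef with hamza above
    "\u0624": HAMZA,  # waw with hamza above
    "\u0625": HAMZA,  # alef with hamza below
    "\u0626": HAMZA,  # yeh with hamza above
}
FOLD = str.maketrans(SEATED_HAMZA)

def hamza_key(word):
    """Return a lookup/sort key in which the seat of the hamza is ignored."""
    return word.translate(FOLD)

# Three strings that differ only in the seat of the final hamza
# (illustrative letter sequences, not claimed to be attested spellings).
variants = ["\u0642\u0631\u0623", "\u0642\u0631\u0624", "\u0642\u0631\u0626"]
keys = {hamza_key(w) for w in variants}
print(len(keys))  # all three variants share one key
```

With a key like this, a change of seat caused by a suffix would not move
the word to a different place in a sorted list or hide it from a search.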

    So in Arabic, hamza-on-waw and hamza-on-ya, for example, are like other
    shaping characters, in that the seat (waw or ya) takes up to four
    contextual forms, so given a hamza-on-seat character in context you can
    find which shape to use. However, you can't find which *seat* to use
    given a context change - this is orthography dependent.
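
The contextual shapes available to each hamza-on-seat character can be
checked against the Arabic presentation forms that Unicode encodes. This is
a sketch only: it assumes that the presence of a named "... FORM" character
reflects the joining behaviour of the seat, and contextual_forms is a
hypothetical helper name.

```python
import unicodedata

FORMS = ("ISOLATED", "FINAL", "INITIAL", "MEDIAL")

def contextual_forms(ch):
    """List the contextual forms for which Unicode encodes a
    presentation-form character of the given letter."""
    base = unicodedata.name(ch)
    found = []
    for form in FORMS:
        try:
            unicodedata.lookup(f"{base} {form} FORM")
        except KeyError:
            continue  # no such presentation form is encoded
        found.append(form)
    return found

print(contextual_forms("\u0626"))  # yeh with hamza above
print(contextual_forms("\u0624"))  # waw with hamza above
```

Note that the seats do not all have the same number of forms: ya joins on
both sides, while waw joins only to the preceding letter.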

    It sounds like Urdu and Sindhi, by contrast, don't vary the seat, so
    shape selection is predictable. In other words, a hamza in a given word
    will always use the same context-dependent seat. A hamza with no seat,
    as in column 2 of the table at the top of page 2 of your document, will
    always change to hamza with ya seat, and not hamza with waw or alif seat,
    when Rnoon is suffixed.

    Is that correct? What about other uses of hamza?

    I hope that helps.

    -gregg


