From: Gregg Reynolds (firstname.lastname@example.org)
Date: Sun Sep 18 2005 - 08:17:45 CDT
Sorry for not responding directly earlier. My initial response is that
you've got it exactly right, and I would encourage you to proceed. It
looks to me like you have identified a legitimate need, and your
proposed solution is consistent with Unicode principles. But I have
Lateef Sagar wrote:
> Gentlemen, my question was about including a new Hamza for Sindhi and
> Urdu. The issue with teh marboota might be solved easily if a medial
> shape is introduced. But for my initial question, it may not, because
> all the Hamza's in Unicode are used in Arabic and introducing a separate
> shape for existing Hamza's will definitely cause issues.
> The Hamza that is taught to children in schools, for Sindhi and Urdu is
> just one, with the shapes that I mentioned in my earlier email. If you
> like I can post a scanned picture of text books for clarification. All
> other characters, required for Sindhi and Urdu are very well supported
> by Unicode.
I think it would be useful if you could post some images on a website.
> Local printing industry is using Unicode based software and
> web development houses are making Unicode based Sindhi and Urdu web
> sites. But as I mentioned in my early email, sorting and searching is
> now creating problem with words with Hamza, and particularly verbs and
> their forms. In Arabic Grammar there might be a rule for replacing teh
> marboota with teh, when required in initial or medial forms, but since
> there is only one Hamza in Urdu and Sindhi and there is no Hamza above
> yeh in these languages, therefore no such rule exist in Sindhi and Urdu
> Grammar that says to change the isolated hamza with hamza above yah
> when initial or medial forms are required.
> I want to know your opinion so that I can push Urdu Language Authority,
> Sindhi Language Authority and CRULP to submit a detailed proposal for
> new hamza.
Some terminology you might find useful: in Arabic, there is only one
Hamza character. The various Unicode "characters" that include the
hamza *form* are not characters in Arabic; they are combinations of a
"seat" and a hamza. The seat has no semantics; it is purely graphical,
although choice of which form to use as seat is governed by various
orthographic/morphologic principles. In particular, it has no effect on
lexical ordering. So in your "Hamza Issue.pdf" it might be clearer if
you say e.g. "the shape of the seat of the hamza" instead of "the shape
of the hamza". In Arabic, at least, the shape of the hamza itself never
changes.
But in Arabic there is no reliable rule (or at least no relatively
simple rule) that can be used to determine which seat to use.
Furthermore, in different grammatical forms of the same word, hamza may
take different seats. For example, a final hamza-on-alif may become
hamza-on-ya or hamza-on-waw if a suffix is added to the word. But the
semantic element is the hamza; even if the seat changes, the hamza is
used for lexical lookup and sorting.
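To make the sorting point concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not a real collation implementation: it folds each precomposed hamza-on-seat code point to the bare hamza (U+0621) before comparison, so that the seat has no effect on ordering. The name `hamza_sort_key` and the folding table are hypothetical; production code would tailor a proper collation algorithm instead.

```python
# Bare hamza, the semantic element used for lookup and sorting.
HAMZA = "\u0621"  # ARABIC LETTER HAMZA

# Hypothetical folding table: precomposed hamza-on-seat letters -> bare hamza.
SEATED_HAMZA = {
    "\u0623": HAMZA,  # ARABIC LETTER ALEF WITH HAMZA ABOVE
    "\u0624": HAMZA,  # ARABIC LETTER WAW WITH HAMZA ABOVE
    "\u0625": HAMZA,  # ARABIC LETTER ALEF WITH HAMZA BELOW
    "\u0626": HAMZA,  # ARABIC LETTER YEH WITH HAMZA ABOVE
}
_FOLD = str.maketrans(SEATED_HAMZA)


def hamza_sort_key(word: str) -> str:
    """Return a comparison key in which the seat of the hamza is ignored."""
    return word.translate(_FOLD)


# Two spellings that differ only in the seat of the hamza get the same key.
on_alif = "\u0633\u0623\u0644"  # seen, hamza-on-alif, lam
on_ya = "\u0633\u0626\u0644"    # seen, hamza-on-ya, lam
same_key = hamza_sort_key(on_alif) == hamza_sort_key(on_ya)
```

Passing `key=hamza_sort_key` to `sorted()` would then order words by the hamza itself rather than by whichever seat happens to carry it.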
So in Arabic, hamza-on-waw and hamza-on-ya, for example, are like other
shaping characters, in that the seat (waw or ya) takes four different
contextual forms, so given a hamza-on-seat character in context you can
find which shape to use. However, you can't find which *seat* to use
given a context change - this is orthography dependent.
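The shaping half of this can be sketched with Python's standard `unicodedata` module, which exposes the compatibility decompositions of the Arabic Presentation Forms-B block. The helper name `contextual_forms` is my own, and real shaping is done by the rendering engine and font, not by a lookup like this; the sketch only shows that the form is recoverable from the character plus its joining context, whereas nothing in this data says which seat to pick.

```python
import unicodedata


def contextual_forms(base: str) -> dict:
    """Map joining context -> presentation-form character for `base`."""
    forms = {}
    # Scan Arabic Presentation Forms-B (U+FE70..U+FEFF).
    for cp in range(0xFE70, 0xFF00):
        ch = chr(cp)
        parts = unicodedata.decomposition(ch).split()
        # Single-letter entries look like "<initial> 0626":
        # a context tag followed by exactly one base code point.
        if len(parts) == 2 and int(parts[1], 16) == ord(base):
            forms[parts[0].strip("<>")] = ch
    return forms


# Hamza-on-ya (U+0626) has one presentation form per joining context.
yeh_hamza_forms = contextual_forms("\u0626")
```

For hamza-on-ya this yields isolated, final, initial and medial entries. The same data lists only isolated and final forms for hamza-on-waw (U+0624), since waw does not join to a following letter.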
It sounds like Urdu and Sindhi, by contrast, don't vary the seat, so
shape selection is predictable. In other words, a hamza in a given word
will always use the same context-dependent seat. A hamza with no seat,
as in column 2 of the table at the top of page 2 of your document, will
always change to hamza with ya seat and not hamza with waw or alif seat
when Rnoon is suffixed.
Is that correct? What about other uses of hamza?
I hope that helps.