Re: Arabic aleph representation of glyphs

From: Maher Alnubani (maher.al-nubani@oracle.com)
Date: Fri Mar 05 2010 - 17:45:54 CST


    On 3/5/2010 2:11 PM, CE Whitehead wrote:
    > Hi, thanks very much. You did answer my questions. I still have one
    > more question: would any literate Arabic speaker always type the
    > tanween al-fatah logically after the aleph seat?
    Yes.
    > (Because of course the tanween al-fatah, unlike Arabic vowel
    > diacritics elsewhere, should precede the aleph consonant seat in a
    > visual display and not follow it--that is, in an rtl context, it
    > should be displayed slightly to the right of the aleph--that is how I
    > was taught and indeed how it appears in the combined character in the
    > Unicode extended characters, and indeed that is how it appears when I
    > type it in following the aleph [and of course, it appears this way
    > when I type it in before too].)
    Well, TANWEEN AL-FATH normally appears on top of the ALEF (or the TAH
    MARBUTAH) not before or after it. But, in standard Arabic writing
    TANWEEN is the last thing to write in a word.
    >
    > I also added a few notes below.
    >
    > ------------------------------------------------------------------------
    > Date: Fri, 5 Mar 2010 11:14:40 -0800
    > From: maher.al-nubani@oracle.com
    > To: cewcathar@hotmail.com
    > CC: unicode@unicode.org; ntounsi@gmail.com; rm459@cam.ac.uk;
    > prilop4321@trashmail.net
    > Subject: Re: Arabic aleph representation of glyphs
    >
    > > I hope I was able to answer your questions. Please see my comments
    > > below.
    > Thanks.
    > On 3/4/2010 5:16 PM, CE Whitehead wrote:
    >
    >
    >
    > Hi! The chart you provided had two parts: first the Arabic
    > alphabet; second, the vowel diacritics, not alone, but in the
    > company of consonants.
    >
    > So for someone learning Arabic letters the link you sent has some
    > use.
    >
    > (I wish I could say it helped me with unicode characters; I see
    > that there are some combining characters represented in the
    > Presentation Forms at the Unicode code charts, and those are what
    > I wanted I think.
    >
    > But for anyone learning Arabic also here's a link I found where
    > you can learn about why certain characters have different glyphs:
    > http://www.abjad.com/pyramid.htm there is also:
    > http://www.funwitharabic.com/alphabet.html where you can meet the
    > characters in order, and there is a song too)
    >
    > However, what I was trying to ask about was primarily a display
    > question perhaps.
    >
    > BACKGROUND:
    >
    > There are versions of the Arabic vowel diacritics associated with
    > the indefinite case endings, which actually consist of the short
    > vowel plus the -n sound at the end, and these come only at the end
    > of words--and in fact, only at the end of words that are
    > 'indefinite' or 'not determined' by the article 'al.'
    >
    > You don't have to write the diacritics in Arabic, only the
    > consonants (so these diacritics are secondary and more like accent
    > marks and such in Latin-1). The problem comes with the indefinite
    > accusative however, fathatan,
    > because you have to insert an unspoken/not-pronounced alef as a
    > seat for the diacritic and the alif has to be written of course.
    >
    > (Similarly, there is a 'consonant,' the hamza it is called, which
    > is the glottal stop, that often takes a seat; unlike the seated
    > fathatan diacritic for the accusative indefinite -- the seated
    > hamzas are represented in the primary characters chart at:
    > http://www.unicode.org/charts/PDF/U0600.pdf
    >
    > You can have the hamza alone and also represented with different
    > seats: 0621-0626 -- although one of these characters actually
    > involves a suppressed hamza -- or whatever [the hamza is
    > suppressed when it comes between two vowels; I think I've got this
    > right?] for 0622 [is this right?].
    >
    > This group might actually be considered to consist of combined
    > characters since all but 0621 include both a diacritic and a
    > character seat for it.
    >
    > The vowel diacritics are represented here in isolation, also on
    > this page but not with seats.
    >
    > You represent the vowel diacritic fathatan with aleph [or
    > alternately it's written alif] elsewhere in the supplements
    > [Presentation Forms-A] and the hamza diacritics as well
    > [Presentation Forms-B].
    >
    > [On the main page again, see 0627 - 064A for the primary
    > consonants if you want those; those are the characters that have
    > to be typed, that I consider primary.])
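
The remark that 0622-0626 behave like combined characters can be checked against the canonical decompositions in the Unicode Character Database. The following is a minimal sketch using Python's standard `unicodedata` module; the variable name `decomposed` is only illustrative.

```python
import unicodedata

# Map each code point in 0621..0626 to the code points of its NFD form.
decomposed = {
    f"{cp:04X}": [f"{ord(c):04X}" for c in unicodedata.normalize("NFD", chr(cp))]
    for cp in range(0x0621, 0x0627)
}

for cp, parts in decomposed.items():
    names = " + ".join(unicodedata.name(chr(int(p, 16))) for p in parts)
    print(cp, "->", " ".join(parts), "|", names)
```

Under this check 0621 stays a single code point, while each of 0622-0626 splits into a seat letter plus a combining mark (0653, 0654 or 0655). For 0622 the mark is MADDAH ABOVE rather than a hamza.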
    >
    > But of course the only time the inflectional ending needs a seat
    > is when it is in the accusative case; otherwise it is just a
    > diacritic at word's end!
    >
    > * * *
    > Now . . . for my questions:
    >
    > (1), The logical typing order for the vowel diacritic for sure is
    > normally first the consonant seat and then the vowel
    > diacritic--although the vowel diacritic appears above or below the
    > consonant and not in rtl order.
    >
    > However, at the end of the word, with the inflectional ending, you
    > don't have alternate ways of writing the vowel and its character
    > seat; so whether you type the vowel diacritic before or after the
    > alif that serves as a seat, there should be only one display
    > possibility as far as I can think (I may be wrong).
    >
    > But my browser (IE) displays the vowel-aleph combo differently
    > depending on typing order -- and I don't think it should in this
    > case since this diacritic is an end of word character -- someone
    > straighten me out on this. I'm sending the attachment again
    > (renamed because the name was confusing because I call this a
    > double vowel diacritic because there are two slashes and not one
    > but it's not really a doubled vowel): on the attachment, you can
    > see the characters together and the two different typing orders.
    >
    > (Maybe typing order matters?--someone correct me.)
    >
    >
    > > Yes, logical typing order does affect the visual display. Generally,
    > > Tanween Al-FATH (what you called fathatan) would be the last thing
    > > typed in a word. If you type it before the Alef, the renderer would
    > > superimpose it on the previous letter, not the Alef.
    >
    > me] Yes, normally for me the diacritic would be typed after the
    > consonant seat; I guess I sometimes type tanween al-fath before the
    > consonant seat because in this case the diacritic (tanween al-fath)
    > should appear to be slightly preceding--that is to the right of in an
    > rtl context--the aleph seat. However, from what you say my typing the
    > characters in this order is an error (and would mess up
    > line-breaking). (Thus are you saying that any literate Arabic speaker
    > would always type the tanween al-fath last? Also it is my
    > understanding that the tanween character is only used at the end of a
    > word, as an inflectional ending that indicates a noun or adjective is
    > indefinite, and belonging to a particular case; thus it would be
    > bizarre to associate the tanween al-fath with a character that
    > preceded the aleph.)
    >
    > * * *
    > (2), Also, further down in my attached page, the tah-marbutah is
    > an end-of-word character, and I expected it to turn into an
    > ordinary tah when I added an inflectional ending since in Arabic
    > an ordinary tah must precede the inflectional ending; but the
    > character remained a tah-marbutah; you can add inflectional
    > endings to it and so I am wondering: shouldn't it display like an
    > ordinary tah when there is an inflectional ending afterwards? (Do
    > you code it as in someway an allo-glyph of tah?)
    >
    >
    > > As you stated, Tah-Marbutah is an end-of-word character, and is a
    > > different letter from the Tah. Tanween vowels (what you call
    > > inflectional ending) would superimpose on it the same way they would
    > > on the Alef for Tanween Fateh. But, you
    > > do not need to add an Alef for Tanween Al-Fateh when a word
    > > (normally a noun) ends with Tah-Marbutah.
    >
    > Displaying it like tah before an inflectional ending would look
    > Arabic. (Someone is going to argue with me and say that I should
    > have typed a tah and not a tah-marbutah anyway before the
    > inflectional ending but I would first type the word, then the
    > tah-marbuta, then perhaps later add in my voweling.)
    >
    > > Again, Tah Marbutah letter and Tah letter are two different letters.
    > > Maybe you are confusing it with the Ha letter. Ha at
    > > the end of the word would look like the Tah marbutah but without the
    > > two dots above. When you add a letter after the
    > > Ha, the Ha would connect to it. In Arabic, however, you would never
    > > find a word that ends with a Ha and at the same
    > > time have a Tanween ending. When Ha connects to a name, it makes it
    > > a definite name (similar to adding AL). Definite
    > > names would not accept Tanween as an ending.
    >
    > My mistake; sorry! But for some reason I have harbored a view
    > that tah-marbutah is an alternate form of tah, which appears at the
    > end of a feminine noun or an adjective in the feminine, but before the
    > inflectional/tanween ending -- because (according to the Arabic I
    > learned; hope I learned right) tah-marbutah is pronounced like a tah
    > once the inflectional ending is added (I just looked this up and this
    > is the reason for the name tah-marbutah but there is no association
    > with tah or ha; and I suppose that the inflectional endings are often
    > dropped in speaking so that the tah-marbutah is not pronounced as a
    > tah often: I don't pronounce correctly anyway). My mistake again in
    > saying I would still need the aleph seat for the tanween al-fath.
    >
    >
    > Thanks again for your info.
    >
    > Best,
    >
    > C. E. Whitehead
    > cewcathar@hotmail.com <mailto:cewcathar@hotmail.com>
    >
    >
    >
    >
    >
    > > Date: Thu, 4 Mar 2010 18:56:41 +0100
    > > From: prilop4321@trashmail.net <mailto:prilop4321@trashmail.net>
    > > To: unicode@unicode.org <mailto:unicode@unicode.org>
    > > CC: cewcathar@hotmail.com <mailto:cewcathar@hotmail.com>
    > > Subject: Re: Arabic aleph representation of glyphs
    > >
    > > Dear CE Whitehead:
    > >
    > > Your messages are confusing and I don't really understand
    > > what you mean and what you want.
    > >
    > > But have a look at
    > > http://www.user.uni-hannover.de/nhtcapri/arabic-alphabet.html
    > > Perhaps this page will help you understand the Arabic script
    > > in Unicode.
    > >
    >
    >
    > ------------------------------------------------------------------------
    >
    > ًا
    >
    > 064B 0627
    >
    > ------------------------------------------------------------------------
    >
    > اً
    >
    > 0627 064B
    >
    > ﴼ
    >
    > FD3C
    >
    > ------------------------------------------------------------------------
    >
    >
    > ABOVE: the aleph with the double (for an indeterminate ending)
    > fatah diacritic, varying logical order; followed by the
    > presentation form.
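
To see how the three samples above relate at the code-point level, one option is to list their character names and compare normalized forms. This is a sketch using Python's standard `unicodedata` module; the labels in `samples` are only descriptive, and the comparison says nothing about what any particular browser or font does.

```python
import unicodedata

samples = {
    "mark first": "\u064B\u0627",        # 064B 0627
    "alef first": "\u0627\u064B",        # 0627 064B
    "presentation form": "\uFD3C",       # FD3C
}

for label, s in samples.items():
    print(label, [f"{ord(c):04X} {unicodedata.name(c)}" for c in s])

# Compatibility normalization folds the presentation form to letter + mark.
folded = unicodedata.normalize("NFKC", samples["presentation form"])

# Canonical normalization does not move a mark across a base letter, so the
# two typing orders remain distinct strings.
orders_equivalent = (
    unicodedata.normalize("NFC", samples["mark first"])
    == unicodedata.normalize("NFC", samples["alef first"])
)

print([f"{ord(c):04X}" for c in folded], orders_equivalent)
```

The presentation form folds to the alef-first sequence, and the two typing orders are not canonically equivalent, so a renderer is not expected to display them identically.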
    >
    >
    > BELOW: the tah marbuta connected to a following aleph with
    > double fatah diacritic, varying logical order for the aleph
    > and fatah diacritic.
    >
    > ةًا
    >
    > 0629 064B 0627
    >
    > ------------------------------------------------------------------------
    >
    > ةاً
    >
    > 0629 0627 064B
    >
    > ------------------------------------------------------------------------
    >
    > Note: as you can see, everything displays as it should regardless
    > of when/where you type the vowel diacritic logically--except the
    > change in logical order should not, in my opinion, change the
    > display appearance in any way; also if you have any problems with
    > the display all you need to do is add a meta tag stating the
    > document character set; I think it's o.k. though as it's actually
    > encoded ansi and I put in numbers of course. In any case, you may
    > wish to check this display in different browsers--I'm not sure
    > what's making the two orderings display differently--something
    > about the unicode characters or something to do with the browser
    > implementation? Thanks.
    >
    >
    > Below: Character Input Order Sometimes Does Matter; It
    > Suppresses Display Altogether
    >
    > darrasa 'to teach'
    >
    > دَرَّسَ
    >
    > دَرَّسَ
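
The archived text does not make clear which code points differ between the two darrasa samples. As an assumption only, suppose they differ in whether SHADDA (0651) or FATHA (064E) is typed first after the REH. Under that assumption, the sketch below (standard `unicodedata` module, illustrative variable names) checks whether Unicode treats the two orders as canonically equivalent.

```python
import unicodedata

REH, SHADDA, FATHA = "\u0631", "\u0651", "\u064E"

shadda_first = REH + SHADDA + FATHA   # assumed typing order A
fatha_first = REH + FATHA + SHADDA    # assumed typing order B

# Canonical combining classes decide how marks on one base are reordered.
classes = (unicodedata.combining(FATHA), unicodedata.combining(SHADDA))

same_after_nfc = (
    unicodedata.normalize("NFC", shadda_first)
    == unicodedata.normalize("NFC", fatha_first)
)

print(classes, same_after_nfc)
```

Because both marks sit on the same base letter and have different combining classes, normalization puts them in one fixed order. If a display differs between the two inputs, that would point at the renderer or font rather than at the encoding, always assuming the samples really differ in the way supposed here.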
    >


