Assimilating damma

From: Gregg Reynolds (greynolds@greynolds.com)
Date: Wed Jul 21 1999 - 07:24:39 EDT


Arno pointed out in a message last week that a form of tanween marking
appeared to be missing from Unicode. It looks like he's right.

There are two forms for each of fathatan, dammatan, and kesratan. They
have different semantics (in Arabic).

Unicode has the "standard" tanween letters, which denote "an" (U+064B),
"un" (U+064C), and "in" (U+064D) phonetically. They also have a
grammatical denotation (indefinite noun) and a lexigraphic role
(word-ending delimiter). For example,

        kitAbun - a book
             ^^ the 'un' termination is a single letter in Arabic.
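
As a rough illustration of the three code points cited above, here is a
small Python sketch that looks them up in the Unicode character database
via the standard unicodedata module. The names TANWEEN, describe, and
kitabun are made up for this example; the spelling of "kitAbun" assumes
the usual letters kaf, teh, alef, beh followed by dammatan.

```python
import unicodedata

# The three "standard" tanween marks cited in the message, keyed by the
# sound they denote. The dictionary name is hypothetical.
TANWEEN = {"an": "\u064B", "un": "\u064C", "in": "\u064D"}


def describe(ch: str) -> str:
    """Return the code point and Unicode character name for one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for sound, ch in TANWEEN.items():
    print(sound, describe(ch))

# "kitAbun" assumed to be kaf, teh, alef, beh, then dammatan: the whole
# 'un' termination is a single code point appended to the final letter.
kitabun = "\u0643\u062A\u0627\u0628" + TANWEEN["un"]
print(len(kitabun))
```

The point of the last two lines is only that the termination is one
encoded character, not a vowel plus a separate nun.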

The "assimilating" form of these letters has the same grammatical and
lexigraphic meaning, but phonetically denotes the assimilation of the
/n/ with the following consonant, and has a slightly different visual
form. This occurs when the word is followed by a word beginning with
ya, ra, meem, lam, or nun. The purpose of the mark is to indicate proper
pronunciation; it is used in Quranic text. I'm not sure if it is used
elsewhere or not, but it would certainly be useful for annotating text
for learners, poetry, etc.

For example, "kitAbun mubeen" is a common phrase in the Quran meaning "a
clear book". The proper pronunciation of this is "kitAbummubeen" - the
nun is assimilated into the following meem. This is indicated
orthographically by using the assimilating dammatan. So if we use "un"
to mean "standard dammatan" and "u-n" to mean "assimilating dammatan",
the same word, book, might be written

        kitAbun kabeer - a big book
        kitAbu-n mubeen - a clear book
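
The rule could be sketched in Python roughly as follows. This is only an
illustration working on the ad hoc Latin transliteration used in this
message, not on Arabic text; the set of triggering letters is the one
listed above (ya, ra, meem, lam, nun), and the names
ASSIMILATING_INITIALS, TANWEEN_ENDINGS, and annotate are invented for
the example.

```python
# Initial letters that trigger assimilation, as listed in this message,
# written in the message's own Latin transliteration (an assumption).
ASSIMILATING_INITIALS = {"y", "r", "m", "l", "n"}

# Standard tanween endings mapped to the "assimilating" notation.
TANWEEN_ENDINGS = {"an": "a-n", "un": "u-n", "in": "i-n"}


def annotate(word: str, next_word: str) -> str:
    """Mark the tanween ending of `word` as assimilating when the next
    word begins with one of the triggering letters; otherwise return
    `word` unchanged."""
    ending = word[-2:]
    if ending in TANWEEN_ENDINGS and next_word[:1].lower() in ASSIMILATING_INITIALS:
        return word[:-2] + TANWEEN_ENDINGS[ending]
    return word


print(annotate("kitAbun", "kabeer"), "kabeer")
print(annotate("kitAbun", "mubeen"), "mubeen")
```

With the two phrases above, the first call leaves "kitAbun" alone and
the second yields "kitAbu-n". A real annotator would have to work on the
Arabic letters themselves and choose between two distinct encoded marks,
which is exactly what is not possible today.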

It looks to me like these three "characters" should be included in
Unicode. Would it help to have feedback on this from the community?
There are a few mailing lists devoted to Arabic; I could conduct an
informal survey if it would help the editors. I'll see about finding a
good graphic sample.

Sincerely,

Gregg Reynolds


