FATHATAN (was: RE: Presentation of unknowned composited sequences for arabic script)

From: Reynolds, Gregg (greynolds@datalogics.com)
Date: Wed Jul 14 1999 - 15:44:16 EDT


Not sure what you mean by "arabic people", but since I read and speak it I
guess I'll butt in.

1. FATHATAN is not a vowel, it's a vowel plus nunation. It can only come
at the end of a word (with one class of exception I'll describe below); this
should be part of its semantics. At least this is the case for encoding
Arabic text; I don't know if this applies to other literary traditions that
use the Arabic script.

2. When a fathatan terminates a word, it must rest on an alef "seat" (as in
the alef-with-fathatan ligatures U+FD3C and U+FD3D, or on a lam-alef), unless
the terminal consonant of the word is a teh marbuta, in which case the
fathatan sits over the teh marbuta (in dotted heh form).

3. So if the semantics of fathatan is "fatha vowel + nunation +
[word-terminator]", as it is for a reader of Arabic, the first 3 characters
of your example below would map to [initial beh]+[medial beh]+[final
alef]+[fathatan]; the remaining fathatans would be noise, with no natural
(at least to my eye) representation. If the semantics of fathatan is simply
a pair of marks floating above another mark, then there's not too much to be
done with it. In particular, a fathatan over an initial or medial (or
isolated) form would be confusing (a clear error).

4. But wait, there's more. What if the final consonant is a hamza?
Variant spellings (forms) are possible. The fathatan may rest on a final
alef preceded by a hamza on a (dotless) ya' seat; or it may rest above the
tatweel binding the ya' and the alef. Or it may rest above the hamza.
It depends on the word.

5. No, we're not finished yet! For some words that end with an alef
maqsura represented by dotless ya', such as fat_A ("_A" symbolizing dotless
ya'), the fathatan rests over the consonant preceding the dotless ya':
fat(an)_A. How should this be encoded?
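
To make the reading of the BEH + BEH + FATHATAN... sequence concrete, here is
a minimal Python sketch, not a rendering algorithm. It assumes a deliberately
simple rule: a fathatan that directly follows a base letter is treated as
readable, and any fathatan stacked on another combining mark is treated as
noise. The function name label_fathatans and the labels are hypothetical,
chosen only for illustration; the sketch does not check word-final position
or the seat rules described above.

```python
import unicodedata

FATHATAN = "\u064B"  # ARABIC FATHATAN
BEH = "\u0628"       # ARABIC LETTER BEH


def label_fathatans(text):
    """Label every FATHATAN in text as 'readable' or 'noise'.

    Assumption (illustrative only): a fathatan is readable when the
    character before it is a base letter, i.e. has combining class 0.
    A fathatan at the start of the text, or one that follows another
    combining mark, is labelled noise.
    """
    labels = []
    for i, ch in enumerate(text):
        if ch != FATHATAN:
            continue
        prev = text[i - 1] if i > 0 else ""
        follows_base = prev != "" and unicodedata.combining(prev) == 0
        labels.append((i, "readable" if follows_base else "noise"))
    return labels


# The sequence from the quoted message: BEH BEH FATHATAN x5 BEH
sample = BEH + BEH + FATHATAN * 5 + BEH
for index, label in label_fathatans(sample):
    print(index, label)
```

With the sample sequence, only the first fathatan (index 2) is labelled
readable; the four that follow it are labelled noise.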

See notes below too:

> -----Original Message-----
> From: Chookij Vanatham [mailto:chookij.vanatham@eng.sun.com]
> Sent: Monday, July 12, 1999 7:19 PM
> To: Unicode List
> Subject: Re: Presentation of unknowned composited sequences for arabic
> script
>
>
>
> Hi Folk,
>
> Just would like to hear from arabic people.
>
> Thanks,
>
> Chookij V.
>
[snip]
>
> BEH + BEH + FATHATAN + FATHATAN + FATHATAN + FATHATAN + FATHATAN + BEH
>
> Moving the diacritic up, wouldn't be enough for lot of vowels. ZWJ and
> ZWNJ wouldn't be involved for this situation. I guess that those vowels
> (FATHATAN) would be displayed with middle form (FATHATAN middle form
> U+FE77) if we still

There is no middle form of fathatan; U+FE77 is middle form fatha (no --an).
Different beast.
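
The code points mentioned in this message can be checked against the Unicode
character database. The following is a small sketch using Python's standard
unicodedata module; the helper name describe is hypothetical. It prints each
character's name and its compatibility (NFKD) decomposition, which is one way
to see that U+FE77 is a fatha form and that U+FD3C and U+FD3D are alef plus
fathatan.

```python
import unicodedata


def describe(cp):
    """Return (name, NFKD decomposition as 'U+XXXX ...') for a code point."""
    ch = chr(cp)
    name = unicodedata.name(ch, "<no name>")
    decomposed = unicodedata.normalize("NFKD", ch)
    parts = " ".join(f"U+{ord(c):04X}" for c in decomposed)
    return name, parts


# U+064B is the plain combining fathatan; the others are presentation forms.
for cp in (0x064B, 0xFE70, 0xFE71, 0xFE77, 0xFD3C, 0xFD3D):
    name, parts = describe(cp)
    print(f"U+{cp:04X}  {name}  ->  {parts}")
```

The names printed depend on the Unicode data shipped with the Python build in
use, so treat the output as a check rather than as a statement about what any
particular font or renderer does.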

> want these vowels displayed connected each other (for cursive script).

Not sure what you mean by this; vowels don't connect.

Sincerely,

Gregg


