RE: Arabic aleph representation of glyphs

From: CE Whitehead (cewcathar@hotmail.com)
Date: Sun Mar 28 2010 - 15:03:38 CST





    Hi!

    I still have questions about line breaking and collation for the tanween-al-fatah (Unicode ً, U+064B)* when it is seated on the aliph (Unicode ا, U+0627).
    (* In an Arabic word, the tanween-al-fatah can only sit on, or be combined with, the aliph or the tah-marbutah at the end of the word;
    it sits above and slightly to the right of the aliph in an rtl context, and above and slightly to the left of the tah-marbutah in an ltr context.)


    The tanween-al-fatah is classified by Unicode as a non-starter character, that is, a combining mark, as far as I can tell.
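
    That classification can be checked against the Unicode Character Database. A minimal sketch, assuming Python's standard unicodedata module; the helper name describe is made up for illustration. A non-zero combining class is what makes a character a non-starter.

```python
import unicodedata

FATHATAN = "\u064B"  # the mark called tanween-al-fatah in this message
ALEF = "\u0627"      # the aliph

def describe(ch):
    """Return (name, general category, canonical combining class) for one character."""
    return (unicodedata.name(ch), unicodedata.category(ch), unicodedata.combining(ch))

fathatan_info = describe(FATHATAN)
alef_info = describe(ALEF)

print(fathatan_info)  # ('ARABIC FATHATAN', 'Mn', 27)
print(alef_info)      # ('ARABIC LETTER ALEF', 'Lo', 0)
```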


    However, I was under the impression that it was traditionally typed (on the old manual typewriters)
    before the aliph.

     

    (1) I did read
    (in http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries)
    that ". . . a single combining mark is a (degenerate) combining character sequence"


    There is also something called 'prepending' (what is that, and does it apply here?).

    I am wondering if it would be possible to treat
    a tanween-al-fatah preceding an aliph at a word's end as an irregular sequence, so that it could have a compatibility mapping to aliph followed by tanween-al-fatah?

    (Otherwise, I am wondering how it would work with Line Breaking Rule 9 [LB9 in http://www.unicode.org/reports/tr14/proposed.html#BreakingRules]:
    "LB9 Do not break a combining character sequence; treat it as if it has the line breaking class of the base character in all of the following rules.
    . . .
    "At any possible break opportunity between CM and a following character, CM behaves as if it had the type of its base character.")

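    To see what is at stake for LB9, here is a rough sketch of how the typing order decides which base character the mark attaches to. It is a simplification, not the UAX #14 or UAX #29 algorithm: it assumes that any character whose general category starts with "M" joins the character before it. The function name combining_sequences is hypothetical.

```python
import unicodedata

def combining_sequences(text):
    """Split text into base characters, each followed by its combining marks.

    Simplified: a character whose general category starts with 'M' is
    attached to the preceding character; a leading mark stands alone.
    """
    groups = []
    for ch in text:
        if groups and unicodedata.category(ch).startswith("M"):
            groups[-1] += ch
        else:
            groups.append(ch)
    return groups

TEH, ALEF, FATHATAN = "\u062A", "\u0627", "\u064B"

# Mark typed after the aliph: it groups with the aliph.
mark_after_alef = combining_sequences(TEH + ALEF + FATHATAN)
# Mark typed before the aliph: it groups with the preceding letter instead.
mark_before_alef = combining_sequences(TEH + FATHATAN + ALEF)

print([len(g) for g in mark_after_alef])   # [1, 2]
print([len(g) for g in mark_before_alef])  # [2, 1]
```

    Under this simplified view, a mark typed before the aliph would take the line breaking class of the preceding letter rather than that of the aliph.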
     

     

    (2) I do think, however, that a tanween-al-fatah typed before the aliph should be sorted the same as the aliph + tanween-al-fatah sequence, and should generally match it in a search.
    That is, it needs to be reordered somehow, even though the tanween-al-fatah is a non-starter character,
    in order for texts to sort properly.

    (However, I think searching in Arabic should generally match consonants with or without diacritics, and the tanween-al-fatah is just a diacritic; but I don't generally do searches in Arabic.)
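
    A small sketch of the matching problem, assuming Python's standard unicodedata and re modules. It shows that none of the four normalization forms makes the two typing orders equal, because the aliph is a starter and blocks any reordering. The two folding functions are hypothetical illustrations of the reordering and the diacritic-insensitive search described above; they are not anything Unicode or the collation algorithm defines.

```python
import re
import unicodedata

ALEF, FATHATAN = "\u0627", "\u064B"
BEH, YEH, TEH = "\u0628", "\u064A", "\u062A"

mark_first = BEH + YEH + TEH + FATHATAN + ALEF  # tanween typed before the aliph
alef_first = BEH + YEH + TEH + ALEF + FATHATAN  # tanween typed after the aliph

def differs_under_all_forms(a, b):
    """True if a and b stay different under NFC, NFD, NFKC and NFKD."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return all(unicodedata.normalize(f, a) != unicodedata.normalize(f, b) for f in forms)

def fold_tanween_order(text):
    """Hypothetical folding: rewrite word-final FATHATAN+ALEF as ALEF+FATHATAN."""
    return re.sub(FATHATAN + ALEF + r"(?!\w)", ALEF + FATHATAN, text)

def strip_marks(text):
    """Hypothetical search key: drop every nonspacing mark (category Mn)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")

print(differs_under_all_forms(mark_first, alef_first))        # True
print(fold_tanween_order(mark_first) == alef_first)           # True
print(strip_marks(mark_first) == strip_marks(alef_first))     # True
```

    Either key could be applied to both the stored text and the query before comparing; which one is appropriate depends on whether diacritics should matter in the search at all.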

     

    (3) Regarding security, I don't see a terrible problem, though I had trouble viewing all the characters at:

    http://www.unicode.org/reports/tr36/idn-chars.html
    (I can't read the characters there, but
    it seems the tanween-al-fatah is only allowed above the tah-marbutah
    and with the aliph?
    It seems vowels are disallowed except in remapped compatibility characters
    -- which I thought were to be shunned --
    and this means that the tanween-al-fatah by itself is disallowed.)
     
    Because the tanween-al-fatah apparently only occurs in the remapped compatibility characters with the aliph or tah-marbutah, there should not be security issues arising from the fact that
    it displays about the same whether it sits on the aliph or on the preceding character;
    although it would be nice if addresses with remapped compatibility diacritics were bundled with addresses
    that use the plain consonants/seats only.

     

    (NOTE: I also looked at the online discussion in Arabic, which you all pointed to some time back, on the tanween-al-fatah when seated on the alif,
    in case anyone wants to help translate or explain it.
    It was at:
    http://www.ahlalhdeeth.com/vb/sendmessage.php

    but I can no longer access the link:

    مثلاً: (بيتًا) أم (بيتاً)؟
    For example: bayt-a-n
    or
    bayt-a-n? (The two spellings differ only in the placing of the tanween-al-fatah -- the two little slashes above the word: before the alif in the first, on the alif in the second.)

    ولكن الأكثرين على أن التنوين يوضع على ما قبل الألف
    . . . However/but the majority hold that the tanween sits on what comes before the alif

    ولذلك لا تجد في تحقيقات القدماء من المحققين إلا وضعها قبل الألف
    And because of this, in the investigations/editions of the ancients (qudama') among the scholarly editors, you do not find it except with its seat
     before/in front of the alif (in a right to left context)

    الذي يحسم الخلاف في نظري
    What settles/terminates the disagreement, in my view,

     أن وضعها على الألف يجرّد الحرق السابق من الشكل، مع أنه أحق بالشكل
    is that seating it on the alif strips { the next word is a typo; it should be 'al-h.arf, 'character', not 'al-h.arq, 'burning/rubbing together' } the previous letter of its vowel marking ('al-shakl),
    although that letter is more entitled to the marking.

    ولكن بعضهم يشكل الحرف السابق أيضا فيضع فتحة عليه
    However, some of them also vowel-mark the previous character, so that a fatah sits upon it.

    { ? Does this mean that the tanween-al-fatah sits upon both the preceding character and the aliph simultaneously? })


    Best,

    C. E. Whitehead
    cewcathar@hotmail.com
                                                   


