Arabic Jazm Urdu

From: CE Whitehead (cewcathar@hotmail.com)
Date: Sat Mar 12 2011 - 01:25:20 CST


    Hi.
    From: Vinodh Rajan (vinodh.vinodh@gmail.com)
    Date: Fri Mar 11 2011 - 12:40:34 CST

    > Hi,

    > I found a proposal for the character /Arabic Jazm Urdu/ here :
    > http://std.dkuug.dk/jtc1/sc2/WG2/docs/n2413-4.pdf

    > Ishida's page on Urdu ( http://people.w3.org/rishida/scripts/urdu/ ) recommends
    > that the /Jazm/ be handled as a font variant of the Arabic /Sukun/.

    > (The other character *not* recommended by Ishida, but supposedly having the
    > same form is U+06E1.
    It seems that U+0652 is normally recommended for both the sukun and the jazm (with language-specific styling applied, I guess). The exception would be the Qur'an, where both U+06E1 and U+0652 are used; and of course the Qur'an is in a single language, Arabic.
    > Although, the shape seems to be different than that of
    > the /Jazm/ in Govt. of Pakistan's proposal )
    It apparently looks like the French circumflex ("circonflexe"):
    http://users.skynet.be/hugocoolens/newurdu/specials.html
    (Apparently it can either lie on its side or look like the circumflex, so there are two forms/glyphs for jazm. Sukun looks like a tiny o or 0 but is likewise a diacritic. Altogether, then, there are three different shapes here.)
    > Why was this character not encoded ?

    > Is it recommended by the Unicode that the Urdu /Jazm/ be handled as a glyph
    > variant of the Arabic /Sukun/. ?
    I personally don't see a reason to encode it separately for IDNs.
    (In fact the three dots above an Arabic tah with dots look like a circumflex too, so having jazm in IDNs as well might be confusing.
    However, I don't know why some Urdu characters are encoded separately from Arabic and some are not. The digits, for example, can be a security issue for IDNs too if care is not taken, yet they are encoded separately. Still, I personally would not allow a jazm character in IDNs.
    Unicode apparently -- in 2002 -- needed to see more research before encoding the Urdu jazm:
    http://unicode.org/consortium/utc-minutes/UTC-091-200205.html .)
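
    To illustrate the point about separately encoded digits: the Arabic-Indic digits (U+0660..U+0669) and the Extended Arabic-Indic digits (U+06F0..U+06F9, used for Urdu and Persian) carry the same numeric values but are distinct code points. Here is a minimal Python sketch using only the standard unicodedata module; the helper name digit_info is my own, not part of any Unicode or IDNA specification:

```python
import unicodedata


def digit_info(ch):
    """Return (code point label, character name, numeric value) for one digit."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.digit(ch))


# Two different code points that both mean "4".
arabic_indic_four = "\u0664"
extended_arabic_indic_four = "\u06F4"

for ch in (arabic_indic_four, extended_arabic_indic_four):
    print(digit_info(ch))

# Same numeric value, but the strings do not compare equal.
same_value = unicodedata.digit(arabic_indic_four) == unicodedata.digit(extended_arabic_indic_four)
same_code_point = arabic_indic_four == extended_arabic_indic_four
print(same_value, same_code_point)
```

    This only shows that the two digit sets are distinct at the code point level; it says nothing about what any particular IDN registry actually permits.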
    For the Qur'an, perhaps a separate character could be encoded, but I feel it would be best to ask native speakers of Arabic what they think.
     
    Normally jazm and sukun apparently have the same use: both indicate the absence of a vowel, and both are different from "null."

    (http://www.cle.org.pk/clt10/papers/Automatic%20Diacritization%20for%20Urdu.pdf :
    "Though Null diacritic may be
    generally confused with Jazm (which marks absence
    of Zer, Zabar or Pesh on consonants), the two are
    quite distinct. Null is used to indicated absence of a
    diacritic on a consonant before long vowels, thus on
    onset consonants in a syllable, whereas Jazm is used
    on consonants to indicate that they are coda position
    in a syllable (see [6] for discussion on Urdu syllable
    structure).")

    In the Qur'an, however, it seems that a sukun-like character is used instead to mean "null."

    According to SIL's info, Quranic requirements are in some ways met by U+06E1 and U+0652,
    except that in the Qur'an U+0652 corresponds to "ignore this consonant" (null), whereas normally in Arabic it indicates, like the Urdu jazm, no vowel:
    ftp://ftp.sil.org/unicode/arabic_marks/Arabic_Marks.doc.pdf
     
    (It seems to me that the Unicode notes on these two characters are not quite clear.)
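
    For anyone who wants to check what the standard itself records for these two characters, here is a minimal Python sketch using the standard unicodedata module. The helper name describe is my own; the output reflects whichever Unicode version your Python build ships with:

```python
import unicodedata


def describe(cp):
    """Return (label, name, general category, combining class) for a code point."""
    ch = chr(cp)
    return (
        f"U+{cp:04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.combining(ch),
    )


# The two code points discussed in this thread.
for cp in (0x0652, 0x06E1):
    print(describe(cp))
```

    Both come back as nonspacing marks (category Mn) with different character names, which confirms that they are separate characters but does not settle which one a given text should use.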
     
    Elsewhere, outside of the Qur'an, you can change the font variant for Urdu text using language styling.
     
    (Hope this helps answer your question; sorry I can't do better.)
     
    Best,
    --C. E. Whitehead
    cewcathar@hotmail.com
                                                   


