Arabic Jazm Urdu

From: CE Whitehead (cewcathar@hotmail.com)
Date: Sat Mar 12 2011 - 01:25:20 CST

Next message: Philippe Verdy: "Re: Case mappings"

Previous message: Vinodh Rajan: "Arabic Jazm Urdu"

Hi.
From: Vinodh Rajan (vinodh.vinodh@gmail.com)
Date: Fri Mar 11 2011 - 12:40:34 CST

> Hi,

> I found a proposal for the character /Arabic Jazm Urdu/ here :
> http://std.dkuug.dk/jtc1/sc2/WG2/docs/n2413-4.pdf

> Ishida's page on Urdu ( http://people.w3.org/rishida/scripts/urdu/ ) recommends
> that the /Jazm/ be handled as a font variant of the Arabic /Sukun/.

> (The other character *not* recommended by Ishida, but supposedly having the
> same form is U+06E1.
It seems that U+0652 is normally recommended for both the sukun and the jazm (with language-specific styling applied, I guess). The exception would be the Qur'an, where both U+06E1 and U+0652 are used; and of course the Qur'an is in a single language, Arabic.
> Although, the shape seems to be different than that of
> the /Jazm/ in Govt. of Pakistan's proposal )
It apparently looks like the French circumflex ("accent circonflexe"):
http://users.skynet.be/hugocoolens/newurdu/specials.html
(It can apparently either lie on its side or look like the circumflex, so there are two forms/glyphs for jazm. Sukun looks like a tiny o or 0, but it too is a diacritic. Altogether there are three different shapes here.)
> Why was this character not encoded ?

> Is it recommended by the Unicode that the Urdu /Jazm/ be handled as a glyph
> variant of the Arabic /Sukun/. ?
I personally don't see a reason to encode it separately for IDNs.
(In fact the three dots above an Arabic tah with dots look like a circumflex too, so having jazm in IDNs as well might be confusing.
However, I don't know why some Urdu characters are encoded separately from Arabic and some are not. The numbers can be a security issue for IDNs too if care is not taken, yet these are encoded separately. Still, I personally would not allow a jazm character in IDNs.
Unicode apparently needed to see more research in 2002 before encoding the Urdu jazm:
http://unicode.org/consortium/utc-minutes/UTC-091-200205.html .)
For the Qur'an, perhaps a separate character could be encoded; but I feel it would be best to ask native speakers of Arabic what they think.
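
On the point about numbers being encoded separately: here is a minimal sketch, assuming the "numbers" meant are the Arabic-Indic digits (U+0660..U+0669) and the Extended Arabic-Indic digits (U+06F0..U+06F9) used for Persian and Urdu. It uses Python's standard unicodedata module to show that the two sets carry the same numeric values under different names. The helper names digit_pairs and mixes_digit_ranges are hypothetical, and the mixing check is only a simplified illustration of the kind of care an IDN registry might take, not the rule as any specification states it.

```python
import unicodedata

ARABIC_INDIC_ZERO = 0x0660           # U+0660..U+0669, used with Arabic
EXTENDED_ARABIC_INDIC_ZERO = 0x06F0  # U+06F0..U+06F9, used with Persian and Urdu


def digit_pairs():
    """Pair each Arabic-Indic digit with its Extended Arabic-Indic counterpart."""
    rows = []
    for value in range(10):
        arabic = chr(ARABIC_INDIC_ZERO + value)
        extended = chr(EXTENDED_ARABIC_INDIC_ZERO + value)
        rows.append(
            (value, unicodedata.name(arabic), unicodedata.name(extended))
        )
    return rows


def mixes_digit_ranges(label):
    """Return True if a label contains digits from both ranges (hypothetical check)."""
    has_arabic = any(0x0660 <= ord(c) <= 0x0669 for c in label)
    has_extended = any(0x06F0 <= ord(c) <= 0x06F9 for c in label)
    return has_arabic and has_extended


if __name__ == "__main__":
    for value, arabic_name, extended_name in digit_pairs():
        print(value, "|", arabic_name, "|", extended_name)
    # One digit from each range in the same label.
    print(mixes_digit_ranges("\u0661\u06F1"))
```

Several of these digit pairs look alike in many fonts, which is why a label mixing the two ranges could mislead a reader.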

Normally jazm and sukun apparently have the same use: both indicate the absence of a vowel, and both are distinct from "null."

(http://www.cle.org.pk/clt10/papers/Automatic%20Diacritization%20for%20Urdu.pdf :
"Though Null diacritic may be generally confused with Jazm (which marks absence of Zer, Zabar or Pesh on consonants), the two are quite distinct. Null is used to indicated absence of a diacritic on a consonant before long vowels, thus on onset consonants in a syllable, whereas Jazm is used on consonants to indicate that they are coda position in a syllable (see [6] for discussion on Urdu syllable structure).")

In the Qur'an, however, it seems that a sukun-like character is used to mean "null" instead.

According to SIL's info, Qur'anic requirements are in some ways met by U+06E1 and U+0652, except that in the Qur'an U+0652 corresponds to "ignore this consonant" (null), whereas normally in the Arabic language it indicates no vowel, like the Urdu jazm:
ftp://ftp.sil.org/unicode/arabic_marks/Arabic_Marks.doc.pdf

(It seems to me that the Unicode notes on these two characters are not quite clear.)
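
For reference, the names and basic properties of these two code points can be read from the Unicode Character Database. This is a minimal sketch using Python's standard unicodedata module; the helper name describe is hypothetical, and the output reflects whichever Unicode version the local Python build ships with. It shows only how the characters are named and classified, and says nothing about how they ought to be used for Urdu or for the Qur'an.

```python
import unicodedata

# The two code points discussed in this thread.
MARKS = {
    "U+0652": "\u0652",
    "U+06E1": "\u06E1",
}


def describe(ch):
    """Return (name, general category, canonical combining class) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.combining(ch),
    )


if __name__ == "__main__":
    for label, ch in MARKS.items():
        name, category, ccc = describe(ch)
        print(f"{label}  {name}  category={category}  ccc={ccc}")
```

Neither name mentions jazm: U+0652 is named ARABIC SUKUN, and U+06E1 is named for its shape, ARABIC SMALL HIGH DOTLESS HEAD OF KHAH.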

Elsewhere, outside of the Qur'an, you can change the font variant for Urdu text using language styling.

(Hope this helps answer your question; sorry I can't do better.)

Best,
--C. E. Whitehead
cewcathar@hotmail.com


This archive was generated by hypermail 2.1.5 : Sat Mar 12 2011 - 01:30:48 CST