From: Behnam (firstname.lastname@example.org)
Date: Tue Aug 29 2006 - 16:18:44 CDT
On 28-Aug-06, at 9:27 PM, Andries Brouwer wrote:
> On Mon, Aug 28, 2006 at 04:18:20PM -0400, Behnam Rassi wrote:
>> I agree with John Hudson. Kurdish E can be achieved by U+06D5
> Yes. But then what is Kurdish H?
>> The other problem is with the definition of Arabic Heh itself and not
>> any particular locale. Arabic Heh is an exception in that it has five
>> forms. The fifth form is 'abbreviated form' which is a non-joining
>> character used for abbreviation and enumeration.
>> Worse, this form is wrongly presented in Unicode PDF files as the
>> representative of Arabic letter Heh, which indeed should be the oval
>> If the fifth form gets its own code, it may solve the problem in
>> Kurdish and many other languages as well.
> But Unicode does not encode shapes but semantics.
> So if two languages each have a Heh, but the shaping behaviour
> differs, then in principle different code points are required.
> That is why there is U+06CC next to U+064A (and U+0649).
The use of shapes, particularly in the 'heh' family, among the different
languages of the Arabic script is very fluid and interchangeable. What
is defined as the medial and final forms of heh goal for Urdu can
easily be used in Persian, or in Arabic for that matter. Some believe it
is a calligraphic choice of the font maker; I believe it should be an
optional choice of the user in encoding. But this is another story. The
point I want to make is that, in searching for an answer to your
question 'what is Kurdish heh', one should be certain that the shapes
of the initial, medial and final forms are not just a matter of
optional taste, but irrevocable rules.
If this is clarified, then yes, I agree with you that Kurdish heh
requires its own code.
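
For reference, the code points cited by number in the quoted text above
can be looked up with Python's standard unicodedata module. This is only
an illustrative sketch: the names THREAD_CODE_POINTS and describe are
made up here, and the result depends on the Unicode data shipped with
the local Python build.

```python
import unicodedata

# Hypothetical list: code points cited by number in the quoted text above.
# It is not meant as a complete inventory of the 'heh' family.
THREAD_CODE_POINTS = [0x0647, 0x0649, 0x064A, 0x06CC, 0x06D5]


def describe(cp: int) -> str:
    """Return 'U+XXXX NAME' for a single code point."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp), '<unnamed>')}"


if __name__ == "__main__":
    for cp in THREAD_CODE_POINTS:
        print(describe(cp))
```

The character names say nothing about joining behaviour or glyph shape,
which is exactly the part under dispute here.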
> If I understand you correctly, your fifth form of Heh is
> the isolated form that now is commonly represented using
> U+0647,U+200D ?
Yes, which means it is encoded differently from U+0647 anyway, so why
not give it its own code? And why show it as the representative of the
letter heh in the Unicode PDF?
This is a practical demonstration of an irrevocable rule in Arabic and
Persian: this shape is never used within a sentence, only as an
isolated non-joining form for abbreviation and enumeration. The initial
form is used only because it has a similar shape, but it has a totally
different contextual behavior.
The problem with 'U+0647,U+200D' is that it produces the visible
initial form and not the real shape of heh dochashme isolated. To
rectify that, I put a substitution glyph for this combination in my
fonts. Fonts that don't have this substitution produce an initial
shape, which is calligraphically incorrect. Technically, it is still a
combination that joins to its left, whereas the abbreviated form (heh
dochashme isolated) shouldn't.
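
The sequence in question and the positional shapes of heh can be
inspected with a short Python sketch that uses only the standard
unicodedata module. The names HEH_ZWJ and positional_forms are
hypothetical, chosen for this illustration. The sketch reads character
data only; it performs no shaping and cannot show what any particular
font renders.

```python
import unicodedata

# The sequence discussed above: ARABIC LETTER HEH followed by ZERO WIDTH JOINER.
HEH_ZWJ = "\u0647\u200D"


def positional_forms(base: str) -> dict:
    """Map a form tag ('isolated', 'final', ...) to the compatibility
    character in Arabic Presentation Forms-B that decomposes to `base`."""
    forms = {}
    for cp in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Entries look like '<initial> 0647'; keep single-character mappings only.
        if len(parts) == 2 and parts[1] == f"{ord(base):04X}":
            forms[parts[0].strip("<>")] = chr(cp)
    return forms


if __name__ == "__main__":
    for ch in HEH_ZWJ:
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
    for tag, ch in positional_forms("\u0647").items():
        print(f"{tag:9} U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Only four positional forms are listed for U+0647 in that block; the
non-joining abbreviated shape has no entry of its own there.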
Incidentally, as was mentioned in this thread, the four-form behavior
of the Urdu letter heh dochashme is disputed. If it is established
that this is a right-joining-only letter, then it can more easily be
used for the abbreviated form. At least it would be a much better
option than 'U+0647,U+200D'.