Re: kurdish sorani

From: Andries Brouwer (aebr@win.tue.nl)
Date: Mon Aug 28 2006 - 12:14:41 CDT

    On Mon, Aug 28, 2006 at 07:30:41PM +0330, Behnam Esfahbod wrote:
    > Hi Andries,
    >
    > Some features you need are font-level features. Take a look at how
    > SIL International did them in Scheherazade and Lateef fonts:
    >
    > http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=ArabicFonts

    I see that this page correctly shows the shaping of Kurdish H
    (and does not mention Kurdish E).
    But this is not an answer - one needs a way to write Kurdish E and
    a way to write Kurdish H, entirely apart from any font that might
    be used later to display the text. What Unicode characters are used
    for Kurdish E? For Kurdish H?

    > On the other hand, I got an email from Erdal Ronahi, who wants to
    > write an XKB symbol file, which puts ARABIC HEH+ZWNJ with one key
    > press, which is not the right way, as you may already put a SPACE
    > after HEH, so the ZWNJ will be useless and make the text hard to
    > process, search, etc.

    Coding Kurdish E with U+0647, U+200C is certainly a possibility;
    in fact I did precisely this: force the right shapes using ZWNJ.
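
    As an illustration of the coding just described, here is a minimal
    Python sketch that builds the two-code-point sequence and lists what
    it contains. The helper names kurdish_e and describe are made up for
    this example; only the code points U+0647 and U+200C come from the
    discussion.

```python
import unicodedata

HEH = "\u0647"    # ARABIC LETTER HEH
ZWNJ = "\u200C"   # ZERO WIDTH NON-JOINER

def kurdish_e() -> str:
    """Return Kurdish E coded as HEH followed by ZWNJ (hypothetical helper)."""
    return HEH + ZWNJ

def describe(text: str) -> list[str]:
    """List each code point in text together with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

for line in describe(kurdish_e()):
    print(line)
```

    The ZWNJ stops HEH from joining to whatever follows, so a renderer
    has to pick the final or independent shape.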

    No particular coding makes text hard to search - indeed, if
    everybody uses the same convention, that makes searching easy.
    So, the Unicode standard must come with some specific recommendation
    about how to code Kurdish. Be it using ZWNJ or using some other
    mechanism, possibly using new code points.

    If the construction using ZWNJ is recommended, then a Kurdish-aware
    renderer should be taught about this coding - right now I get
    the right shapes, but the surrounding spacing is not optimal.

    > Would you please describe the usage of these characters in more
    > detail? Can you have a SPACE after Kurdish E? Do you have the SPACE
    > all the time? etc...

    Kurdish H occurs in all positions (initial, medial, final).
    Kurdish E also occurs in all positions, but in initial position
    it is preceded by "yeh with hamza", so one could say that
    typographically it does not normally occur in initial position.

    So, yes, a space can follow both Kurdish E and Kurdish H.

    Andries

    >> ARABIC LETTER HEH (U+0647) is a letter with 4 glyph forms.
    >> In Kurdish (written in the Sorani, essentially Arabic, alphabet)
    >> one has two letters (let me call them Kurdish H and Kurdish E)
    >> and these 4 glyph forms become the two forms of Kurdish H
    >> and the two forms of Kurdish E.
    >> Initial and medial Heh are forms of Kurdish H, final and
    >> independent Heh are forms of Kurdish E.
    >> Kurdish E never joins to the following letter, so needs only two forms.
    >> Initial, final and independent Kurdish H are all written with initial Heh.
    >>
    >> I wonder what the correct way is to write Kurdish in Unicode
    >> (without using language tagging).
    >> Are new Unicode code points needed? Do these exist already?
    >>
    >> Using the Arabic Presentation Forms just for Heh does not work well,
    >> since shaping of neighbouring characters will be wrong.
    >> Writing all in presentation forms cannot be the correct solution.
    >>
    >> Andries
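
    The correspondence described in the quoted message can be tabulated
    with a short Python sketch. The four code points are the HEH entries
    of the Arabic Presentation Forms-B block. The assignment of each form
    to Kurdish H or Kurdish E simply restates the message above, and the
    names HEH_FORMS, KURDISH_LETTER and base_letter are invented for this
    example.

```python
import unicodedata

# Presentation forms of ARABIC LETTER HEH (Arabic Presentation Forms-B).
HEH_FORMS = {
    "isolated": "\uFEE9",
    "final": "\uFEEA",
    "initial": "\uFEEB",
    "medial": "\uFEEC",
}

# Which Kurdish letter each Heh form represents, as described in the message.
KURDISH_LETTER = {
    "initial": "Kurdish H",
    "medial": "Kurdish H",
    "final": "Kurdish E",
    "isolated": "Kurdish E",
}

def base_letter(form_char: str) -> str:
    """Fold a presentation form to its base letter via NFKC normalization."""
    return unicodedata.normalize("NFKC", form_char)

for form, ch in HEH_FORMS.items():
    base = base_letter(ch)
    print(f"{form:8} U+{ord(ch):04X} -> {KURDISH_LETTER[form]}, base U+{ord(base):04X}")
```

    All four forms fold to the single code point U+0647 under
    compatibility normalization, so the Kurdish H / Kurdish E distinction
    does not survive once presentation forms are normalized away.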


