Re: kurdish sorani

From: Andries Brouwer
Date: Tue Aug 29 2006 - 13:52:29 CDT


    On Tue, Aug 29, 2006 at 10:44:16AM -0700, John Hudson wrote:
    > Andries Brouwer wrote:
    >> Now that I look at Uyghur, it seems that U+06BE is no good for
    >> Uyghur either, so both Kurdish and Uyghur seem to need a new
    >> code point.
    > Note that no one has suggested using U+06BE for Kurdish. The Uighur h is an
    > entirely different matter, and U+06BE is best for it.

    As explained in my previous letter, I disagree.
    But today I am mostly interested in Kurdish.

    > In Kurdish, one would use U+0647 for h, and U+06DE for the ə.

    I think you meant U+06D5 (for Kurdish E).
    Yes, I see no objections against that.

    On the other hand, U+0647 for Kurdish H is impossible
    without language tagging, in precisely the same way that
    U+0647 is impossible for Uyghur H.

    >> Position:   Isol  Init  Med   Final
    >> Kurdish H:  init  init  med   init
    >> Uyghur H:   init  init  med   med
    >> and
    >> Urdu:       isol  init  init  isol
    >> If the Urdu behaviour is called U+06BE, then that is no good
    >> for Kurdish and Uyghur. Not only are the shapes rather different,
    >> but the distribution of the two shapes over the four positions
    >> differs.
    > The reason you see something that looks like a medial form in a final
    > position in some Uighur publications is that a medial glyph form of U+0647
    > has been used, often with the left connecting stroke filed or cut off
    > (we're talking about metal and phototype printing). If you look at Uighur
    > manuscripts, the shaping for h is pretty much identical to the Urdu U+06BE,
    > and this is reflected in better publications and more recent fonts.

    You talk about the shape, and maybe misunderstand me: I talked
    about the shaping, that is, the rules that select the glyph
    given the contextual position.
    Kurdish H has two forms, one is used in medial position, the other
    in all other positions.
    Uyghur H has two forms, one is used in medial and final position,
    the other elsewhere.
    Urdu two-eyed h has two forms, one is used in initial and medial
    position, the other elsewhere.
    Arabic Heh has (at least) four forms.
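
The four rules above can be written down as a small lookup table. This is only a sketch to make the comparison concrete: the names `POSITIONS`, `SHAPING`, `distribution` and `same_shaping` are invented for illustration, the form labels follow the table quoted earlier, and the Arabic Heh row assumes exactly four forms although it may have more.

```python
# Contextual positions, in the order used in the quoted table.
POSITIONS = ("isol", "init", "med", "final")

# Hypothetical table: for each letter, the glyph form chosen in each position.
# The Arabic Heh row assumes exactly four forms (the text says "at least" four).
SHAPING = {
    "Kurdish H":       {"isol": "init", "init": "init", "med": "med",  "final": "init"},
    "Uyghur H":        {"isol": "init", "init": "init", "med": "med",  "final": "med"},
    "Urdu two-eyed h": {"isol": "isol", "init": "init", "med": "init", "final": "isol"},
    "Arabic Heh":      {"isol": "isol", "init": "init", "med": "med",  "final": "final"},
}

def distribution(letter):
    """Return the glyph form chosen in each of the four positions."""
    return tuple(SHAPING[letter][p] for p in POSITIONS)

def same_shaping(a, b):
    """True if two letters group the four positions in the same way.

    Only the grouping of positions is compared, not the names or the
    outlines of the glyph forms, so this is about shaping, not shape.
    """
    def pattern(letter):
        forms = distribution(letter)
        # Replace each form by the index of its first occurrence.
        return tuple(forms.index(f) for f in forms)
    return pattern(a) == pattern(b)

if __name__ == "__main__":
    for name in SHAPING:
        print(name, distribution(name))
    print(same_shaping("Kurdish H", "Uyghur H"))
```

Under these assumptions no two of the four letters group the positions in the same way, which is the point being made: the difference lies in the rule, independent of how any single glyph is drawn.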

    You see - an answer that talks about metal and phototype printing,
    about strokes filed off, doesn't cut it. The question is not a
    matter of font, of the precise shape that is used, but a matter
    of shaping - what rules are being used to choose the different
    glyphs for different positions.

    If you agree that I represent the glyph-choosing rules correctly,
    then you must conclude that U+06BE is incorrect for Uyghur,
    and certainly that U+0647 is totally incorrect for Kurdish.
    How is a renderer to know that U+0647 must not be rendered in the
    shape of U+06D5?
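
The renderer's problem can be illustrated with a toy table. This is a sketch under stated assumptions, not a description of any real shaping engine: `build_table` is a hypothetical helper, the rules are written as (isol, init, med, final) glyph choices, the Arabic rule assumes four forms, and the language tags "ar" and "ku" stand in for whatever language tagging would be used.

```python
# Hypothetical rules, written as (isol, init, med, final) glyph choices.
ARABIC_HEH = ("isol", "init", "med", "final")
KURDISH_H = ("init", "init", "med", "init")

def build_table(assignments):
    """Build a key -> shaping rule table.

    Raises ValueError if one key is asked to carry two different rules.
    """
    table = {}
    for key, rule in assignments:
        if key in table and table[key] != rule:
            raise ValueError(f"{key!r} would need two different shaping rules")
        table[key] = rule
    return table

if __name__ == "__main__":
    # Keyed on (language tag, code point): both rules can coexist.
    tagged = build_table([
        (("ar", 0x0647), ARABIC_HEH),
        (("ku", 0x0647), KURDISH_H),
    ])
    print(len(tagged), "rules with language tagging")

    # Keyed on the code point alone: the two rules collide.
    try:
        build_table([(0x0647, ARABIC_HEH), (0x0647, KURDISH_H)])
    except ValueError as err:
        print("without tagging:", err)
```

With the code point alone as key there is room for only one rule, so either a language tag or a separate code point has to carry the distinction.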

