Re: kurdish sorani

From: Philippe Verdy
Date: Wed Aug 30 2006 - 22:56:25 CDT

    From: "John Hudson"
    > Andries Brouwer wrote:
    > This is my point: the normative shaping is not language-specific but style-specific. In
    > the naskh style, the character encoded as U+06BE has two forms. In the nasta'liq style it
    > has one form.

    And the Nasta'liq style is closely tied to one specific language, in which that distinction is not needed. Your opinion holds only for Urdu and doesn't even make sense for the Arabic language. What is tolerated in one language should not be assumed to be tolerated in other languages merely because they use the same Unicode script.

    In reality, the Unicode Arabic script is the result of unifying similar but still distinct scripts, each with its own rules derived from the languages that use it.

    What you're saying is exactly like saying that accents in European languages sharing the same Unicode "Latin" script are just a matter of style: of course you can design a font style that does not display any accents, but then you'll introduce ambiguities into the language orthographies.
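
    To illustrate the accent analogy, here is a minimal Python sketch. The helper name strip_accents and the French word pair are my own illustrative choices, not anything taken from this thread; the helper simply mimics a "font style" that shows no accents by dropping combining marks after canonical decomposition.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks, mimicking a style that displays no accents.

    Hypothetical helper for illustration only.
    """
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Two distinct French words ("peach"/"fishing" and "sin") ...
words = ["pêche", "péché"]
# ... become indistinguishable once the accents are gone.
collapsed = {strip_accents(w) for w in words}
print(collapsed)
```

    The two spellings collapse into a single string, which is the kind of ambiguity I mean: the distinction belongs to the orthography, not to the style.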

    When a language makes a semantic distinction, it's important for Unicode to allow encoding it clearly, independently of whether font styles implement it or not. It's not up to Unicode to dictate which fonts to use. And given the current issues, there is for now no better solution than using language tagging in texts, hoping that fonts and renderers will support it and that multilingual documents will be able to carry the language distinctions in some higher-level protocol.

    It has been demonstrated here that there are true semantic distinctions between the various letter forms admitted for the Arabic language. So a clean solution must be developed that allows encoding them independently of the font style, and even independently of any language tagging, which cannot always be encoded in a higher-level protocol (especially for plain-text files).
