Re: kurdish sorani

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu Aug 31 2006 - 02:18:07 CDT

    ----- Original Message -----
    From: "John Hudson" <john@tiro.ca>
    Cc: <unicode@unicode.org>
    Sent: Thursday, August 31, 2006 7:03 AM
    Subject: Re: kurdish sorani

    > Philippe Verdy wrote:
    >
    >> Where do we have a reference in OpenType that correctly describes the shaping behavior for the various languages using the Arabic script? It looks like U+06BE was not covered in these specs, which may explain why it was forgotten; and in any case, there is certainly a lack of agreement and specification for any language other than Arabic.
    >>
    >> All this suggests that the OpenType working group should really work on this script and immediately start a survey of the effective coverage of all languages using it. It is clear from these discussions that only the Arabic language has been seriously considered, and I fear we will learn that other issues are still undetected, even for languages that we know are (or were) written with the Arabic script (and there are many...).
    >
    > There is no 'OpenType working group' responsible for such things. The OpenType font format
    > is becoming part of the MPEG standard, which I presume involves some kind of working
    > group, but that is specifically the font file format (itself an extension of the TrueType
    > sfnt format)

    But someone maintains the OpenType specs. They are hosted by Microsoft, but is Microsoft really leading the project, or just aggregating documentation from various vendors? As for the MPEG working group, it already has so much work to do with lots of proprietary standards and licensing issues. Creating a standard from OpenType is a really difficult task, given the limitations that apply to all those attempting to implement it. If the solution must pass through MPEG compliance, I fear lots of interoperability problems: the need to encode text semantics independently of proprietary standards will be forgotten, and we will return to the dark age of ambiguities and lack of interoperability.

    Even if it is informal, such a working group does exist, and it appears that those participating in OpenType are also members of the UTC. So their work is related to Unicode recommendations for supporting more languages with the existing unified scripts. The best they can do is to make explicit what they have done to support some languages, and to study how the current solutions break with some other languages.

    When unification of scripts causes semantic ambiguities, a solution (new characters, or new recommended sequences of existing characters) will need to be documented by Unicode. That documentation must also consider the work required in implementations: for renderers, when shaping is implemented in code with some common tables built in instead of being imported from each font design; and for font designers, when this renderer code assumes the presence of some tuning parameters in specific OTL tables embedded in fonts.

    Those working on these OTL tables need to think about what is customizable in fonts and what is part of the renderer, and to consider the tradeoffs (notably when the renderer ignores some font-specific tables and considers only its own internal tables).

    > and it should be noted that OpenType *Layout* is an optional aspect of
    > OpenType (all the OTL tables are designated optional; a conformant OT font might not have
    > any OTL tables at all).

    I am not much concerned about the optional OpenType Layout tables, especially those that are part of "language system"-specific conditional features. These are, I think, just a way to hack fonts (temporarily) so that they work in some limited environments or applications; such conditional features are not the long-term solution.

    There's a clear need for encoding the semantics in the plain text itself. If this requires adding more characters or format controls to the stream, this must be documented by Unicode, and the solutions then deployed in a recommendation for font vendors, so that they become part of the default profile of fonts (even if "language system"-specific features remain, to disable some parts of these additions).

    Nobody can be satisfied if each vendor adopts its own conventions there, because this acts exactly like font hacks, where an encoding is abused only for temporary convenience and usability by a limited audience. Interoperability must be the focus, and solutions must be developed so that they work in plain text, without explicit language tagging, and in multilingual documents.

    > There is no body responsible for researching and specifying
    > language-specific font behaviour.

    I did not ask about that. I was speaking about the generic (default) language system implemented in compliant fonts. What an OT font includes as optional features for specific language systems (or even for specific renderers and technologies, such as those made with Graphite, AAT, Volt, ...) is not important to me.

    So yes, I am discussing the 'default' layout that fonts should implement, and that must be adapted by each technology vendor (mostly: Microsoft, Apple, Adobe, MonoType, Xerox, ...) in its specific OTL feature tables. Despite the difference in technologies, they must produce consistent results for a common set of features in the default profile supporting each script.


