RE: Does OpenOffice 3.0 handle unicode?

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Mon Mar 23 2009 - 11:55:11 CST


    Adam Twardoch
    > Sent: Monday, March 23, 2009 16:12
    > To: unicode@unicode.org
    > Subject: Re: Does OpenOffice 3.0 handle unicode?
    >
    >
    > I shall add, however, that OpenOffice 3.0 on my Mac OS X does
    > not at all support PostScript-flavored OpenType fonts. This
    > can hardly be called a system-specific issue because Mac OS X
    > handles (renders) all OpenType fonts natively, even though it
    > (Mac OS X) does not render OpenType Layout for complex
    > scripts. Last time I checked, OpenOffice did not support any
    > OpenType PS (.otf) fonts on Windows either.
    >
    > This is certainly an issue directly related to Unicode
    > support

    Absolutely not. PostScript does not even use Unicode (it was created before
    Unicode existed); it uses specific mappings to some proprietary Adobe
    encodings, to a few legacy 8-bit encodings, or mappings by glyph name.
    PostScript does not support the character model: it is fully based on the
    glyph model.

    The standard PostScript operator for rendering text (show) does not even
    decode strings according to a character encoding model: with an ordinary
    base font, each byte of the string selects one entry of the font's encoding
    vector, so it cannot address more than 256 glyphs. For larger fonts with
    many more glyphs, applications (or printer drivers) need to define their own
    rendering routine in the PostScript language itself (within a header before
    the document itself).
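
    To make the 256-glyph limit concrete, here is a minimal sketch in Python
    (not real PostScript) of single-byte glyph selection. The ENCODING list
    stands in for a font's encoding vector, and the glyph names placed in it
    and the helper glyphs_for are illustrative assumptions, not taken from any
    actual font.

```python
# Illustrative model of single-byte glyph selection (not real PostScript).
# ENCODING plays the role of a font's 256-entry encoding vector; the glyph
# names stored in it are examples only.
ENCODING = [".notdef"] * 256
ENCODING[0x41] = "A"
ENCODING[0x42] = "B"
ENCODING[0xE9] = "eacute"


def glyphs_for(data):
    """Map each byte of a string to a glyph name, one byte per glyph."""
    return [ENCODING[b] for b in data]


print(glyphs_for(b"AB\xe9"))
```

    Since a byte can only take 256 values, no string passed this way can reach
    a glyph outside the 256 slots of the vector; a font with more glyphs needs
    another selection mechanism.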

    For complex scripts, OpenType PS fonts use an Adobe-specific CID encoding, but
    this is still an encoding for glyphs, not for characters. There may exist
    some mapping tables in OpenType PS fonts for legacy MacOS encodings (that
    have a one-to-one match between a character code and a glyph), but any
    other encoding using the character model is supported only by the OS or by
    the rendering engine used by the application.

    Note that applications using PostScript fonts need to understand the
    specific format of font metrics, because they are not stored in the font
    itself. OpenType fonts, however, are built by taking a PostScript font and
    compiling the separate font metrics tables into a single file; these
    metrics still use a different format and different conventions than those of
    TrueType-based OpenType fonts.

    Finally, the main difference between TrueType and PostScript font flavors is
    that their outlines are not defined the same way: the former use only
    quadratic Beziers (with one control point between two points on the curve),
    the latter use cubic Bezier curves (with two control points between two
    points on the curve); the coordinate system for defining the glyph's "quad"
    is also a bit different (1024×1024 for TT, 1000×1000 for PS). The encoding
    mapping tables are completely different in OpenType; the glyph composition
    rules have some common parts, but there are tables that make sense only for
    Macs and PS engines, and others only for PCs and TT engines.

    > since the vast majority of Unicode fonts released
    > these days commercially are OpenType PS fonts.

    That's not true. Fonts are made either for Mac (where they are preferably
    defined in OpenType PS type) or for PC (where they are most often in TrueType
    format or OpenType with TrueType flavor). The main difference is not the
    PostScript language itself (except for PS Type 1 fonts, which are generally
    not hinted and require a true implementation of the PostScript language, but
    can be used for simple fonts with simple mappings).

    So your conclusion is false:
    * it's true that this is not a system-specific issue
    * but you cannot conclude that this is a Unicode issue
    * really, this is an **application-specific** issue: the application still
    has no support for the correct usage of PostScript fonts.

    It is not severe on Macs, since Mac OS X has long since adopted support for
    TrueType-based fonts (which are less complex to define than
    PostScript fonts, which are fully based on proprietary standards and
    proprietary mappings and do not support the character model).

    There are still typographers who consider PS superior because of
    its use of cubic Beziers, but this is not really a problem because it
    has been mathematically demonstrated that any cubic Bezier can be safely
    approximated by quadratic Beziers without adding lots of control points, with
    the same precision as the one used by PostScript for its curve "flatness"
    parameter (this parameter limits the decomposition of curves into polygons
    made of straight segments until their relative angle becomes nearly flat).
    The conversion in the other direction, from quadratic to cubic, is exact and
    needs no additional points on the curve. Note that fonts define their curve
    flatness parameters themselves.
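
    As an illustration of converting between the two curve flavors, here is a
    minimal Python sketch of the standard degree-elevation formula that
    rewrites a one-control-point (quadratic) Bezier as an equivalent
    two-control-point (cubic) Bezier. The function names and the sample points
    are my own assumptions for the example; the opposite direction generally
    needs subdivision and is only approximate, so it is not shown.

```python
def quadratic_to_cubic(q0, q1, q2):
    """Return the four cubic control points tracing the same curve as the
    quadratic Bezier (q0, q1, q2). Points are (x, y) tuples."""
    c1 = (q0[0] + 2.0 * (q1[0] - q0[0]) / 3.0,
          q0[1] + 2.0 * (q1[1] - q0[1]) / 3.0)
    c2 = (q2[0] + 2.0 * (q1[0] - q2[0]) / 3.0,
          q2[1] + 2.0 * (q1[1] - q2[1]) / 3.0)
    return q0, c1, c2, q2


def eval_quadratic(q0, q1, q2, t):
    """Evaluate a quadratic Bezier at parameter t in [0, 1]."""
    u = 1.0 - t
    return tuple(u * u * a + 2.0 * u * t * b + t * t * c
                 for a, b, c in zip(q0, q1, q2))


def eval_cubic(c0, c1, c2, c3, t):
    """Evaluate a cubic Bezier at parameter t in [0, 1]."""
    u = 1.0 - t
    return tuple(u ** 3 * a + 3.0 * u * u * t * b
                 + 3.0 * u * t * t * c + t ** 3 * d
                 for a, b, c, d in zip(c0, c1, c2, c3))


# Hypothetical sample arc on a 1000-unit grid.
quad = ((0.0, 0.0), (500.0, 1000.0), (1000.0, 0.0))
cubic = quadratic_to_cubic(*quad)
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, eval_quadratic(*quad, t), eval_cubic(*cubic, t))
```

    Both columns should agree up to floating-point rounding, and the on-curve
    end points are unchanged.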

    The character's definition grid can also be converted to the
    other model, but this is generally not easy if you have used 1000 instead of
    1024 for the quad, because fonts are typically built using integer
    coordinates for their control points, instead of floats. This also affects
    the way font metrics are defined, and their precision: it's not really
    possible to convert those font metrics because you cannot add more control
    points for them to give an equivalent precision.
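
    A small Python sketch of why integer grids do not convert cleanly: scaling
    coordinates from a 1024-unit quad to a 1000-unit quad and rounding to
    integers makes distinct values collide, so the original values cannot be
    recovered. The rescale helper and the sample coordinates are hypothetical;
    the two grid sizes are the ones mentioned above.

```python
def rescale(value, src_units, dst_units):
    """Scale an integer font-unit coordinate from one em grid to another,
    rounding the result to the nearest integer."""
    return round(value * dst_units / src_units)


# Two neighbouring coordinates on a 1024-unit grid (sample values).
a, b = 21, 22

a_1000 = rescale(a, 1024, 1000)
b_1000 = rescale(b, 1024, 1000)
print(a_1000, b_1000)  # both land on the same 1000-unit coordinate

# Going back to the 1024-unit grid cannot tell the two apart any more.
print(rescale(a_1000, 1000, 1024), rescale(b_1000, 1000, 1024))
```

    The same rounding applies to advance widths and other metrics, which are
    single integers and have no extra points that could absorb the error.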

    To work with both flavors of fonts, applications must be prepared to use a
    common model (like the one used in the conventions defined and used by
    FreeType.org), and use a layout engine that can correctly map characters to
    glyphs using the information found in different tables of the fonts.
    Windows has such an API, but it is specific to Windows and may not be
    portable to other systems, so it may not be used by OpenOffice.org, which uses
    its own layout engine.

    So the main problem for OpenOffice is to understand all the OpenType tables
    and be able to compute string layouts according to different metrics, and
    also to discover how the glyphs are mapped in PS-type OpenType fonts. But
    you can't say it is a problem of Unicode: the mapping from Unicode to
    PostScript CID encodings is not defined in the OpenType standard itself, but
    in Adobe proprietary documents (which are subject to licencing
    restrictions). In addition, applications need to buy a licence to be able to
    use PostScript Type 1 fonts made by Adobe (or by a few well-known typography
    providers).

    Note that if you buy a licence for a font design from a typographer, you may
    have to choose between the PostScript-based and the TrueType-based formats. At
    large typography providers (like Monotype), font designs are generally
    available in both formats (you generally have to choose "for PC" or "for
    Mac").

    Given that there are many more PCs than Macs today (and even Macs now
    support TrueType), TrueType-based fonts have become much more common,
    except for well-known legacy fonts (of high quality, built by Adobe and
    available with a paid licence): the fonts that are normally part of the
    "standard" set of PostScript fonts (Courier, Times, Helvetica, Zapf
    Dingbats...).

    Other free (or proprietary) implementations use other font designs that are
    compatible with the Panose characteristics and font metrics, and then
    remap these Adobe fonts to the alternate designs. For example, Microsoft has
    used its own metric-compatible fonts for a few of these legacy Adobe fonts
    (Courier New, Times New Roman, Arial), but Microsoft also created its own
    designs with the help of well-known typography providers (like Monotype, which
    also provided some of its designs to Adobe...), notably Verdana for more
    readable text display on screen (with much less visible placement artefacts
    in WYSIWYG mode, as the larger design allows more freedom for font hinting).

    There are other things that are supported differently on Macs and PCs, but
    this is for legal reasons: font hinting technology is not easily portable
    from one OS to the other in products that must be freely distributed, due to
    copyright restrictions and licencing issues in these technologies. (Users of
    FreeType or GhostScript should all have read the legal restrictions and
    limitations about the implementation, and how to remove this limitation by
    buying a licence from Microsoft, Apple, and/or Adobe; they should also
    know why some "common" font styles are not provided with these
    applications.)

    I have not seen such a legal disclaimer in Pango (or when installing
    OpenOffice), so I'm not sure that it can even legally make use of those
    proprietary extensions covered by costly patents (that are not royalty-free
    for everyone). These patents are certainly not a problem of Unicode or of
    Unicode compliance of the application, because Unicode/ISO 10646 can be fully
    implemented free of charge in applications like OpenOffice without breaking
    full compliance with the standard.

    Did you check whether those limitations also apply to the "StarOffice" version
    licenced and supported by Sun? Its licence may include the support and
    sublicences for these proprietary extensions restricted by Adobe or
    Microsoft patents (or other exclusive copyrights owned by typographers for
    some font designs that are subject to additional licencing), but I can't say
    whether this is actually the case.

    Finally, have a look at the properties of your fonts: some fonts have
    restrictions on their use (some cannot be used for creating new documents,
    or for embedding, or for printing, or for distributing documents,
    or the glyph definitions are protected and do not allow
    derivations to create some effects... There are many possible restrictions).
    This could explain why some of the fonts you have tried do not work in
    OpenOffice but may work in MS Word or other programs with a paid licence.
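
    One place where such restrictions are recorded in TrueType and OpenType
    fonts is the fsType field of the OS/2 table, which holds the embedding
    permissions. Below is a minimal Python sketch that decodes an fsType value
    into readable labels. It assumes you have already obtained the integer
    from the font with some other tool; the function name describe_fstype and
    the label wording are mine, and the field covers embedding only, not every
    restriction listed above.

```python
# Embedding-permission bits of the OS/2 table's fsType field.
FSTYPE_FLAGS = {
    0x0002: "restricted licence embedding",
    0x0004: "preview and print embedding",
    0x0008: "editable embedding",
    0x0100: "no subsetting",
    0x0200: "bitmap embedding only",
}

# Mask of the bits that restrict the basic embedding level.
USAGE_MASK = 0x000E


def describe_fstype(fs_type):
    """Return readable labels for the permission bits set in fs_type."""
    labels = [label for bit, label in FSTYPE_FLAGS.items() if fs_type & bit]
    if (fs_type & USAGE_MASK) == 0:
        # No usage bit set: the font may be embedded and installed.
        labels.insert(0, "installable embedding")
    return labels


# Sample values chosen for illustration, not read from a real font.
for value in (0x0000, 0x0002, 0x0104):
    print(hex(value), describe_fstype(value))
```

    An application that honours these bits may refuse to embed a font, or
    refuse to use it at all, while another application with different
    licensing terms accepts it.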


