Re: Does OpenOffice 3.0 handle unicode?

From: Adam Twardoch (list.adam@twardoch.com)
Date: Mon Mar 23 2009 - 15:57:51 CST

    Dear list subscribers,

    please excuse me for having actually taken (or perhaps I should say
    wasted) the time to write a reply (below) to Philippe's
    trolling message. He threw in so many irrelevant, misleading and wrong
    assertions that I felt compelled to reply after all.

    Philippe,

    > Absolutely not; PostScript does not even use Unicode (it was created before
    > Unicode ever existed), but specific mappings to some proprietary Adobe
    > encodings, or to a few legacy 8-bit encodings, or mappings by glyph name.
    > PostScript does not support the character model, it is fully based on the
    > glyph model.
    >
    > The standard PostScript operator for rendering text (show) does not even
    > support the decoding of strings according to a character encoding model, so
    > it cannot support encodings to more than 256 glyphs. For larger fonts with
    > many more glyphs, applications (or printer drivers) need to define their own
    > rendering routine in the PostScript language itself (within a header before
    > the document itself).

    Irrelevant. That is all true, but it doesn't really have anything to do
    with OpenType. PostScript, the page description language, works the
    same way regardless of what sort of fonts one uses
    (TrueType, Type 1, CID-keyed PostScript, OpenType TT, OpenType PS). Sure,
    PostScript does not support Unicode, but this is completely irrelevant,
    since we are not talking about PostScript here but about OpenOffice.

    OpenOffice is a desktop application that runs on a number of operating
    systems, all of them having some built-in font support, some built-in
    character encoding support (Unicode in most cases), and some built-in
    character-to-glyph shaping and rendering mechanism (typically based on
    OpenType Layout or, in the case of Mac OS X, also AAT).

    > For complex scripts, OpenType PS fonts use a Adobe-specific CID encoding but
    > this is still an encoding for glyphs, not for characters.

    Wrong. CID-keyed OpenType PS fonts are used exclusively for CJK fonts
    (Chinese, Japanese, Korean), which are not considered complex scripts.
    Complex scripts include Arabic, Syriac, Thai, Devanagari, Gujarati,
    Malayalam etc.

    > There may exist
    > some mapping tables in OpenType PS fonts for legacy MacOS encodings (that
    > have a one-to-one match between a character code and a glyph), but for any
    > other encoding using the character model is only supported by the OS or by
    > the rendering engine used by the application.

    I don't understand this sentence.

    > Note that applications using PostScript fonts need to understand the
    > specific format of font metrics, because they are not stored in the font
    > itself.

    I gather that by "PostScript font" you mean "Type 1 font". I never
    mentioned Type 1 in my message, as it is a different format, completely
    unrelated to the issue at hand, and indeed, not compatible with Unicode
    (this is the reason why Adobe declared it obsolete ten years ago).

    But, if you do mean Type 1, you're wrong anyway. Type 1 font metrics are
    "stored in the font itself". Type 1 fonts are stored in several files,
    and the glyph description data is indeed stored in a different file than
    the metrics, but that's just a technical issue. Modern operating systems
    offer built-in APIs that expose both glyph rendering and metric-based
    layout for Type 1 fonts.

    > However OpenType fonts are built using a postscript font and
    > compiling the separate tables for font metrics into a single file; however
    > these metrics use a different format and different conventions than
    > TrueType-based OpenType fonts.

    Of course parts of PostScript-flavored OpenType fonts are different from
    parts of TrueType-flavored OpenType fonts, while some parts are
    identical in both flavors. That is the whole concept of the "flavors",
    or subformats, within the OpenType format. It's no different from the
    different types of compression used in PDF or TIFF files. Either way,
    FreeType 2 since its inception, and the APIs of the other operating
    systems (Mac OS X, Windows since version 2000) offer native support for
    glyph rendering and metric-based layout for both flavors. (For Windows
    9x, Windows NT and Mac OS 9, installation of Adobe Type Manager was
    necessary to get support for PostScript-flavored OpenType fonts, and
    that support was indeed quite limited).
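
    To make the "flavor" idea concrete, here is a minimal Python sketch. It
    is an illustration only, not code from FreeType or from any OS API. It
    assumes a single-font sfnt file (not a TrueType Collection) and reads
    the flavor from the 4-byte version tag at the start of the file:
    0x00010000 or 'true' for TrueType outlines, 'OTTO' for PostScript (CFF)
    outlines. The function names, and the fake font builder used only for
    experimenting, are hypothetical.

```python
import struct

# sfnt version tags found in the first four bytes of a single-font file.
TRUETYPE_TAGS = {b"\x00\x01\x00\x00", b"true"}
CFF_TAG = b"OTTO"


def read_table_directory(data):
    """Return (flavor, list of table tags) for a single-font sfnt byte string."""
    sfnt_version = data[:4]
    if sfnt_version in TRUETYPE_TAGS:
        flavor = "TrueType"
    elif sfnt_version == CFF_TAG:
        flavor = "PostScript (CFF)"
    else:
        raise ValueError(f"not a single-font sfnt file: {sfnt_version!r}")

    # The 12-byte header is: version, numTables, then three search hints.
    (num_tables,) = struct.unpack_from(">H", data, 4)

    tags = []
    for i in range(num_tables):
        # Each 16-byte record holds: tag, checksum, offset, length.
        tag, _checksum, _offset, _length = struct.unpack_from(
            ">4sIII", data, 12 + 16 * i
        )
        tags.append(tag.decode("latin-1"))
    return flavor, tags


def build_fake_sfnt(version, tags):
    """Hypothetical helper: build a header-only 'font' for experimenting.

    The table records point nowhere, so this is not a usable font.
    """
    header = version + struct.pack(">HHHH", len(tags), 0, 0, 0)
    records = b"".join(struct.pack(">4sIII", tag, 0, 0, 0) for tag in tags)
    return header + records
```

    The directory layout is the same in both flavors. What differs is
    mainly which outline tables appear in it ('glyf' and 'loca' for
    TrueType, 'CFF ' for PostScript), while tables such as 'cmap' and
    'head' are shared.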

    > Finally, the main difference between TrueType and Postscript font flavors is
    > that they are not defined the same way: the former use conic Beziers (with
    > one control point between two points on curve) only, the latter use quadratic
    > Bezier curves (with two control points between two points on curve);

    Irrelevant. PDFs consist of both vectors and bitmaps. So what?

    > coordinate system for defining the glyph's "quad" is a bit different also
    > (1024×1024 for TT, 1000×1000 for PS)

    Wrong (but this is a very common misconception). Both TrueType-flavored
    and PostScript-flavored fonts can use the em square of anywhere between
    1 and 32,767. Conventionally, most PostScript-flavored OpenType fonts,
    and the majority of TrueType-flavored OpenType fonts, are based on the
    em square of 1000. A number of TrueType-flavored OT fonts and some
    PostScript-flavored OT fonts are based on the em square of 2048. There
    are a few Asian fonts that use considerably smaller em square
    resolutions (e.g. 512 or 300), and there may be some that use 1024
    (though I don't remember seeing one). But there are also both
    PostScript-flavored OT fonts and TrueType-flavored OT fonts that don't
    follow the convention and use completely different em squares such as
    400, 27, 3000, 16000, and others (usually for very good reasons).
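
    The reason the em square value is not tied to the flavor is that it is
    just a scaling denominator. A small Python sketch of the arithmetic
    follows, assuming the raw bytes of a font's 'head' table are already
    available (in OpenType fonts the value is stored there as a 16-bit
    unsigned integer at byte offset 18). The function names are
    hypothetical.

```python
import struct


def units_per_em(head_table):
    """Read unitsPerEm from the raw bytes of a 'head' table."""
    # Layout before it: version (4), fontRevision (4), checkSumAdjustment (4),
    # magicNumber (4), flags (2). That puts unitsPerEm at byte offset 18.
    (upem,) = struct.unpack_from(">H", head_table, 18)
    return upem


def to_points(font_units, upem, point_size):
    """Scale a distance in font units to points at the given text size."""
    return font_units * point_size / upem
```

    A glyph that is half an em wide comes out at 6 pt in 12 pt text
    whether the font stores that width as 500 units in a 1000-unit em or
    as 1024 units in a 2048-unit em.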

    > The encoding mapping tables are
    > completely different in OpenType

    The commonly-found formats of the "cmap" table in OpenType fonts are
    identical regardless of the font's flavor.
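
    As an illustration of why the flavor does not matter here, the
    following Python sketch lists the subtables of a 'cmap' table given
    only its raw bytes. Nothing in it looks at the outline format. It
    assumes the bytes were already extracted from the font, and the
    function name is hypothetical.

```python
import struct


def list_cmap_subtables(cmap):
    """Return (platformID, encodingID, format) for each cmap subtable."""
    # Header: version (uint16), numTables (uint16).
    _version, num_tables = struct.unpack_from(">HH", cmap, 0)

    subtables = []
    for i in range(num_tables):
        # Each 8-byte encoding record: platformID, encodingID, and the
        # offset of the subtable from the start of the cmap table.
        platform_id, encoding_id, offset = struct.unpack_from(
            ">HHI", cmap, 4 + 8 * i
        )
        # Every subtable begins with a uint16 format number.
        (subtable_format,) = struct.unpack_from(">H", cmap, offset)
        subtables.append((platform_id, encoding_id, subtable_format))
    return subtables
```

    Formats 4 (Basic Multilingual Plane) and 12 (full Unicode range) are
    the ones most commonly encountered in Unicode fonts of either flavor.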

    > the glyph composition rules have some
    > common parts, but there are tables that make senses only for Macs and PS
    > engines, and other only for PCs and TT engines.

    I'm not sure what you mean by "Macs and PS engines" and "PCs and TT
    engines". PostScript-flavored OT fonts are supported on both Mac OS X
    and Windows. TrueType-flavored OT fonts are supported on both Mac OS X
    and Windows. Of course parts differ between flavors but well, that is
    the whole point of the flavors.

    > That's not true. Fonts are made either for Mac (where they are preferably
    > defined in OpenType PS type) or PC (where they are most often in TrueType
    > format or OpenType with TrueType flavor).

    I've been working for MyFonts, one of the major online font
    distributors, continuously since 2000. I've been working as a consultant
    for other major font distributors such as Linotype and Monotype. And
    I've been in regular contact with both large font distributors
    (FontShop) as well as smaller ones. And I've been working for FontLab,
    the primary maker of applications used by people who create fonts, since
    2004. I make my living watching the font development market. I
    regularly talk to hundreds (literally) of type designers every year. And
    I've actually looked at the relevant data.

    > It is not severe on Macs, since MacOSX has also adopted the support for
    > TrueType-based fonts since long (which are less complex to define than
    > PostScript fonts that are fully based on proprietary standards and
    > proprietary mappings and do not support the character model).

    The Type 1 font format specification was published by Adobe
    *nineteen* years ago, with all the subsequent extensions and
    supplemental information published throughout the 1990s.

    Mac OS X did not "adopt" the support for TrueType-based fonts. The
    TrueType font format was actually developed *by Apple*. It premiered
    on the Mac in System 7 in 1991 and arrived on Windows with version 3.1
    in 1992 (Microsoft licensed TrueType from Apple).

    > There are still typographers that consider (...)

    Whatever. Irrelevant. There may be people who argue that PNG is superior
    to JPEG or the other way around, but that's irrelevant. The relevant
    thing is that a web browser needs to support both, or it's -- from the
    user's point of view -- broken.

    > To work with both flavors of fonts, applications must be prepared to use a
    > common model

    No, they "don't must". Applications may choose to use a common model
    (use the OS APIs on Mac OS X and Windows, use FreeType 2 on Linux), or
    they may choose to implement the font support themselves (which is what
    Adobe does in their applications). Either way -- they should do it because
    OpenType is the dominant font format today (and there are more new
    OpenType fonts released in the PostScript flavor than in the TrueType
    flavor).

    > and use a layout engine that can correctly map characters to
    > glyphs using the information found in different tables of the fonts

    The relevant tables that map characters to glyphs (GSUB, GPOS, GDEF,
    cmap) are the same for both flavors.

    > Windows has such an API, but it is specific to Windows and may not be
    > portable to other systems, so it may not be used by OpenOffice.org that uses
    > its own layout engine.

    As far as I know, OpenOffice on Windows actually *uses* the system API
    (Uniscribe) for complex script shaping. On Linux, OpenOffice does not
    use "its own layout engine". It uses Pango and FreeType, which both
    support both OpenType flavors.

    > In addition, applications need to buy a licence to be able to
    > use PostScript Type 1 fonts made by Adobe (or by a few wellknown typography
    > providers).

    First, I was not talking about Type 1 at all. Second, what are you
    talking about? Open-source code that applications can use to display Type
    1 fonts has been available for more than a decade. Rainer Menzner wrote
    the first version of t1lib in 1997, around the same time as FreeType was
    created. Applications do not need to buy any licences to be able to use
    Type 1 fonts (or TrueType, or OpenType fonts) made by Adobe or anyone
    else. (The only issue I know of is the set of limitations on TrueType
    hinting imposed by patents held by Apple).

    Of course if applications want to *bundle* any code (including fonts),
    that code must be properly licensed. But this has nothing to do with
    Adobe or Type 1 fonts, specifically. This applies to Type 1 fonts,
    TrueType fonts, OpenType fonts, C++ software code, Python software code
    and anything else, whether it was made by Adobe, Microsoft, Apple,
    Linotype, Google, Sun, IBM or a small company around the corner.

    It is covered by copyright, and unless a license specifically allows
    application developers to bundle code without paying for a license, it
    is assumed that licenses need to be paid for. What has this to do with
    anything discussed here?

    > There are other things that are supported differently on Macs and PCs, but
    > this is for legal reasons: font hinting technology is not easily portable
    > from one OS to the other in products that must be freely distributed,

    Those limitations apply to TrueType hinting. What has this to do with
    OpenOffice supporting PostScript-flavored OpenType fonts?

    > I have not seen such legal disclaimer in Pango (or when installing
    > OpenOffice), so I'm not sure that it can even legally make use of those
    > proprietary extensions covered by costly patents (that are not royalty-free
    > for everyone).

    This is irrelevant. OpenOffice already uses Pango for complex-script
    shaping in TrueType-flavored OT fonts, but it just doesn't display
    PostScript-flavored OT fonts.

    In short: I'm grateful to James Cloos for pointing out that
    PostScript-flavored OpenType fonts will work in OpenOffice 3.2. About
    time :)

    A.

    -- 
    Adam Twardoch
    | Language Typography Unicode Fonts OpenType
    | twardoch.com | silesian.com | fontlab.net
    I hate to advocate drugs, alcohol, violence, or
    insanity to anyone, but they've always worked for me.
    (Hunter S. Thompson)
    

