Re: minimizing size (was Re: allocation of Georgian letters)

From: Eric Muller
Date: Sat Feb 09 2008 - 19:26:03 CST


    James Kass wrote:
    > Now, I don't know where those extra spaces are coming from, but I bet
    > they make searching difficult.

    Short answer:

    Acrobat (Pro and Reader) attempts to reconstruct the text correctly
    even under adverse conditions. The spaces result from trying to get
    the best results across a wide range of PDF documents.

    Slightly longer answer:

    In many cases, PDF generation is hooked in at a fairly late stage of the
    pipeline that goes from user input to a printed image. For an input
    like "the car", you can end up with PDF content in one of these forms
    (using a pseudo-notation):

    (the car) showstring

    or

    (the) showstring 50 advance (car) showstring

    To accommodate the latter case, Acrobat needs to generate a space
    character when there is no space glyph. Because there are many
    complications of the same nature, the conditions under which to generate
    a space character are nontrivial, and most likely involve some
    compromises. Furthermore, it is quite likely that the class of PDFs
    corresponding to Indic texts was not considered when determining those
    conditions.
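
To make the two forms above concrete, here is a minimal sketch of a gap-based space heuristic. It is not Acrobat's actual logic, which is not described here. The `Run` type, the `extract_text` function, and the `gap_threshold` value of 30 are all hypothetical, chosen only so that the "50 advance" in the pseudo notation counts as a word gap.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    """One 'showstring' call plus the explicit advance that follows it (hypothetical model)."""
    text: str
    advance: float = 0.0  # extra horizontal move after the run, in arbitrary units


def extract_text(runs: List[Run], gap_threshold: float = 30.0) -> str:
    """Join runs, synthesizing a space where the advance looks like a word gap.

    gap_threshold is an assumed tuning value, not a figure from any real product.
    """
    parts = []
    for i, run in enumerate(runs):
        parts.append(run.text)
        if i == len(runs) - 1:
            break  # nothing follows the last run
        next_text = runs[i + 1].text
        # Do not double up when a real space glyph is already present.
        already_spaced = run.text.endswith(" ") or next_text.startswith(" ")
        if run.advance >= gap_threshold and not already_spaced:
            parts.append(" ")
    return "".join(parts)


# Form 1: the space is an ordinary glyph inside the string.
print(extract_text([Run("the car")]))
# Form 2: no space glyph, only an advance between two strings.
print(extract_text([Run("the", 50), Run("car")]))
# A small kerning-like adjustment should not produce a space.
print(extract_text([Run("t", 2), Run("he")]))
```

Any fixed threshold of this kind is a compromise: set too low, it splits words whose glyphs are positioned individually; set too high, it merges words that are separated only by an advance.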

    Maybe the conditions that are actually coded in Acrobat can be refined
    to work better for Indic texts; maybe there are inherent conflicts with
    other PDFs (I just don't know).

