Re: Accessing alternate glyphs from plain text

From: Leonardo Boiko (leoboiko@gmail.com)
Date: Tue Aug 10 2010 - 13:05:36 CDT

    On Tue, Aug 10, 2010 at 13:15, Doug Ewell <doug@ewellic.org> wrote:
    >  Your handwritten A and mine may look different, and both may differ from a
    > typewritten A, but they have something in common that allows us to identify
    > them with each other.

    I have problems with this argument too. For example, consider the
    following text:

    YOURHANDWRITTENAANDMINEMAYLOOKDIFFERENTANDBOTHM
    AYDIFFERFROMATYPEWRITTENABUTTHEYHAVESOMETHINGIN
    COMMONTHATALLOWSUSTOIDENTIFYTHEMWITHEACHOTHER.

    This is written in much the same way as texts were written in the
    past, before spacing, punctuation and lowercase came into being. Now it
    certainly has “something in common that allows us to identify” it with
    your original text. E.g., for most uses (but not all), we don’t mind
    adding modern punctuation and casing to ancient texts and saying it’s
    the “same” text. Nonetheless, by transforming your text I clearly
    lost some information. We don’t want to remove spacing and
    punctuation from plain text, even though the historic examples show
    that they’re not “strictly necessary”. (As you know, our plain text
    can even mark _different_ kinds of spacing, as you’re seeing if you’re
    reading this plain-text sentence in a variable-width font.)
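
    A minimal sketch of that loss, in Python. The function name
    to_scriptio_continua and the sample strings are made up for
    illustration, and unlike the example above the sketch also drops the
    final period. Two plain-text strings that differ in case, punctuation
    and kind of spacing collapse into the same output, so the
    transformation cannot be undone:

```python
def to_scriptio_continua(text: str) -> str:
    """Uppercase the text and drop everything that is not a letter or digit.

    Hypothetical helper: a rough stand-in for writing without spacing,
    punctuation or lowercase.
    """
    return "".join(ch.upper() for ch in text if ch.isalnum())


original = "Your handwritten A and mine may look different."
# Same letters, but different casing, punctuation, and an EM SPACE (U+2003).
variant = "Your handwritten a,\u2003and mine, may look different!"

a = to_scriptio_continua(original)
b = to_scriptio_continua(variant)

print(a)                    # YOURHANDWRITTENAANDMINEMAYLOOKDIFFERENT
print(a == b)               # True: two distinct inputs, one output
print(original == variant)  # False
```

    Since distinct inputs map to one output, no function can recover the
    original spacing, punctuation or casing from the result alone.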

    There’s some information lost when we render our “plain text” as
    ancient text. Similarly, there’s some information lost when we render
    handwritten text, typeset text, or computer “rich text” to plain text.
    It seems to me these two losses are different only in degree, not in
    kind.

    To run with your example, my handwriting certainly can go well beyond
    just “looking different” from typewriting; it can actually encode
    significant linguistic information that the typewriter cannot. I have
    a letter whose author, in a moment of emotional distress, wrote the
    sentence “to hurt myself” several times, and each time the words
    get larger and more slanted, with more irregular forms. This graphic
    device represents features of speech such as intensity, speed,
    intonation, &c., which is to say, it has pretty much the same role as
    punctuation. If you encode her text in plain text, or even in rich
    text, you lose this linguistic information. The only way to keep
    something I’m willing to call “the same text”, in this case, would be
    an image.

    It’s all a matter of intended use.

    > The whole premise of reading and writing is that we
    > look below the surface to the identity of the letters and the meaning of the
    > words.

    No, the whole premise of reading and writing is to represent language,
    which is spoken, in a visual manner. Nothing to do with letters;
    letters are just tools for representing language. You cannot read
    without re-creating sound images in your head. Only after the sound
    image is recreated do you reach the “meaning” (even, contrary to
    popular myth, in the case of so-called “ideographs”). Plain text can
    encode some features of the spoken language, but (obviously) not all.
    Some of the features left out might be considered important for some
    texts, in some uses. Nietzsche’s prose employs a lot of italics (which
    are typographic marks of something like emphatic stress in speech); if
    you take away the italics, the resulting text simply isn’t “the same”:
    everyone who uses Nietzsche’s texts (philosophy students, &c.) is
    interested in keeping the italics.

    The question here is where the cutoff point lies: where do we draw the
    line on what information goes into plain text, and why? In my
    humble opinion there seems to be no clear “why”; the line seems an
    entirely arbitrary technological artifact, a remnant of intuitions
    developed due to limitations of the typewriter, the teletype, and
    early tty-style computer terminals. This is not a bad thing. I’m not
    dissing plain-text or saying we should abolish it or encode italics or
    anything like that. But by the same token I don’t consider it some
    special, unique representation of “true meaning”. Plain text is to me
    simply yet another attempt to represent language, and like all similar
    tools, has its strengths and weaknesses—in particular, like all
    language representation tools, it can encode some kinds of “meanings”
    and not others.

    -- 
    Leonardo Boiko
    

