RE: Encoded rendering instructions (was Unicode's Mandate)

From: Richard T. Gillam (rgillam@las-inc.com)
Date: Wed Mar 09 2005 - 15:41:16 CST


    >Anyone who does digitizing of epigraphic sources is familiar with the
    >various text markup schemes; but no markup addresses plain text
    >integrity.

    You haven't demonstrated why plain text integrity is a requirement.
    You've demonstrated why XML is inconvenient. Not only is inconvenient
    not the same thing as unusable, but XML isn't necessarily your only
    choice here (although it might be that the established standards for
    this kind of thing all use XML). There are lots of methods of carrying
    around out-of-band styling information and metadata.

    Leaving aside for a moment that XML can be a pain in the butt, why is
    plain text a requirement? Are there environments or applications you
    have to use that can only handle plain text? Are you worried about
    out-of-band styling information (or markup) being lost in transmission,
    and if so, why?

    It's hard to imagine this kind of thing being implemented in
    general-purpose text editors-- I don't think you'll ever see this
    functionality in Notepad, for example. If you're going to have to have
    special-purpose applications for handling this sort of display anyway,
    why is it bad for these applications' file formats to be styled text?

    It's not enough to claim that damaged-character annotations are "an
    integral part of the text" and "important information is lost" if
    they're removed by some process. One could make the same arguments
    about headings and emphasized words in normal modern-language text--
    after all, losing track of which words in a sentence are emphasized can
    change its meaning. That wasn't enough to force encoding of characters
    to indicate emphasis. Why is your situation different?

    My second question has to do with the notational system you're
    describing. How widespread is it? Are lots of people representing
    ancient documents this way in printed material? If so, how are they
    doing it now? If not, are there lots of people who want to do it this
    way, or are they using other methods they're happy with? Seems like
    you'd be on much firmer ground if you were proposing a system of visible
    symbols for indicating damage and uncertain readings rather than
    invisible formatting characters that cause things to be grayed out. But
    that'd raise the question of why none of the zillions of existing symbols
    in Unicode would work for this purpose.

    --Rich Gillam
      Language Analysis Systems, Inc.


