From: Richard T. Gillam (rgillam@las-inc.com)
Date: Wed Mar 09 2005 - 15:41:16 CST
>Anyone who does digitizing of epigraphic sources is familiar with the
>various text markup schemes; but no markup
>addresses plain text integrity.
You haven't demonstrated why plain text integrity is a requirement.
You've demonstrated why XML is inconvenient. Not only is "inconvenient"
not the same thing as "unusable," but XML isn't necessarily your only
choice here (although it might be that the established standards for
this kind of thing all use XML). There are lots of methods of carrying
around out-of-band styling information and metadata.
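One such method is stand-off annotation: the text itself stays plain, and the damage information travels beside it as a list of character ranges. The following is only a minimal sketch of that idea, not any established standard. The `Annotation` class, the "damaged" and "uncertain" labels, the bracket rendering, and the sample inscription are all assumptions made for illustration, and the spans are assumed not to overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A hypothetical stand-off annotation over a plain-text string."""
    start: int   # index of the first annotated character
    end: int     # index one past the last annotated character
    label: str   # illustrative labels: "damaged", "uncertain"


def render_with_brackets(text, annotations, label="damaged"):
    """Return text with [ ] around every span carrying the given label.

    Assumes the annotated spans do not overlap.
    """
    out = []
    pos = 0
    for ann in sorted(annotations, key=lambda a: a.start):
        if ann.label != label:
            continue
        out.append(text[pos:ann.start])                  # untouched text before the span
        out.append("[" + text[ann.start:ann.end] + "]")  # the marked span
        pos = ann.end
    out.append(text[pos:])                               # remainder after the last span
    return "".join(out)


# The plain text is stored unmodified; the annotations live separately.
text = "IMP CAESAR AVGVSTVS"
annotations = [Annotation(4, 10, "damaged")]
print(render_with_brackets(text, annotations))  # IMP [CAESAR] AVGVSTVS
```

In a scheme like this, a process that drops the annotation list still leaves the base text intact and readable, which is the trade-off under discussion.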
Leaving aside for a moment that XML can be a pain in the butt, why is
plain text a requirement? Are there environments or applications you
have to use that can only handle plain text? Are you worried about
out-of-band styling information (or markup) being lost in transmission,
and if so, why?
It's hard to imagine this kind of thing being implemented in
general-purpose text editors-- I don't think you'll ever see this
functionality in Notepad, for example. If you're going to have to have
special-purpose applications for handling this sort of display anyway,
why is it bad for these applications' file formats to be styled text?
It's not enough to claim that damaged-character annotations are "an
integral part of the text" and "important information is lost" if
they're removed by some process. One could make the same arguments
about headings and emphasized words in normal modern-language text--
after all, losing track of which words in a sentence are emphasized can
change its meaning. That wasn't enough to force encoding of characters
to indicate emphasis. Why is your situation different?
My second question has to do with the notational system you're
describing. How widespread is it? Are lots of people representing
ancient documents this way in printed material? If so, how are they
doing it now? If not, are there lots of people who want to do it this
way, or are they using other methods they're happy with? Seems like
you'd be on much firmer ground if you were proposing a system of visible
symbols for indicating damage and uncertain readings rather than
invisible formatting characters that cause things to be grayed out. But
that'd raise the question of why none of the zillions of existing symbols
in Unicode would work for this purpose.
--Rich Gillam
Language Analysis Systems, Inc.