Re: Encoded rendering instructions (was Unicode's Mandate)

From: Dean Snyder
Date: Tue Mar 08 2005 - 22:11:02 CST

    Deborah W. Anderson wrote at 9:32 AM on Tuesday, March 8, 2005:

    >To add a bit of information to Asmus' comment:
    >> What would be a nice first step...would be a serious, coordinated
    >> effort by leading paleographers to come to an agreement as to
    >> precisely what kind of information needs to be preserved, and for
    >> what scripts or paleographic sub-discipline it would be sufficient.
    >Already in 1990 the Text Encoding Initiative had defined guidelines on
    >how to mark up texts, particularly for scholarly works. The latest
    >version, P4, provides recommendations on mark-up for damaged text (see
    >A specific set of guidelines ("EpiDoc") based on TEI was developed by
    >epigraphers ( A number of
    >projects with Latin and Greek have implemented these guidelines.

    Two problems in this context with all XML markup:

    1) it's markup, not plain text ;-)

    2) all the markup schemes with which I am familiar (including the
    ones you mention here) use non-empty tags for the markup elements, and
    such tags are basically worthless for overlapping hierarchies of
    meta-textual data. (Try marking up a textual feature that spans portions
    of two paragraphs and you'll see what I mean.) Of course, that could be
    remedied by the use of empty tags, but we would still face the issue
    that the features needing preservation are lost when we convert to plain
    text, unless their specification is part of the plain text stream.
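
    To make the overlap problem concrete, here is a minimal Python sketch.
    The element names (damage, damageStart, damageEnd) and the id/ref
    attributes are hypothetical, chosen only for illustration; they are not
    taken from TEI or EpiDoc. It shows that a non-empty tag spanning two
    paragraphs is not well-formed XML, that a pair of empty tags can mark
    the same span, and that either way the marking disappears once the text
    is flattened to plain text.

```python
import xml.etree.ElementTree as ET

# A damaged region that starts in one paragraph and ends in the next.
# Wrapping it in a single non-empty element crosses the <p> boundary.
OVERLAPPING = ("<text><p>intact <damage>broken</p>"
               "<p>still broken</damage> intact</p></text>")

# The same region marked with two empty tags (hypothetical names).
MILESTONES = ('<text><p>intact <damageStart id="d1"/>broken</p>'
              '<p>still broken<damageEnd ref="d1"/> intact</p></text>')


def is_well_formed(xml):
    """Return True if the string parses as XML."""
    try:
        ET.fromstring(xml)
        return True
    except ET.ParseError:
        return False


def damaged_text(xml):
    """Collect text lying between the start and end empty tags."""
    parts = []
    state = {"inside": False}

    def walk(elem):
        if elem.tag == "damageStart":
            state["inside"] = True
        elif elem.tag == "damageEnd":
            state["inside"] = False
        if state["inside"] and elem.text:
            parts.append(elem.text)
        for child in elem:
            walk(child)
            # A child's tail text follows its closing tag in document order.
            if state["inside"] and child.tail:
                parts.append(child.tail)

    walk(ET.fromstring(xml))
    return parts


def plain_text(xml):
    """Flatten to plain text; all tags, empty or not, are dropped."""
    return "".join(ET.fromstring(xml).itertext())


if __name__ == "__main__":
    print(is_well_formed(OVERLAPPING))
    print(is_well_formed(MILESTONES))
    print(damaged_text(MILESTONES))
    print(plain_text(MILESTONES))
```

    The empty-tag version parses and the span can be recovered by walking
    the tree, but plain_text() returns no trace of it, which is the second
    half of the objection.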


    Dean A. Snyder

    Assistant Research Scholar
    Manager, Digital Hammurabi Project
    Computer Science Department
    Whiting School of Engineering
    218C New Engineering Building
    3400 North Charles Street
    Johns Hopkins University
    Baltimore, Maryland, USA 21218

    office: 410 516-6850
    cell: 717 817-4897
