RE: Encoded rendering instructions (was Unicode's Mandate)

From: James Kass
Date: Wed Mar 09 2005 - 22:23:29 CST


    We have precedent for preserving information about damaged glyphs
    in computer plain text.

    As Bob Richmond already pointed out, damaged glyph characters were
    part of Michael Everson's N1944 proposal for Egyptian Hieroglyphics.

    Quoting from N1944, "Additionally, 40 Alternate Format characters
    will be needed in order to map currently-encoded texts to the UCS."

    According to Bob Richmond, damaged glyph characters are included in
    the private use spec at:

    "Manuel de Codage"
    A standard system for the computer-encoding of Egyptian
    transliteration and hieroglyphic texts
    by Hans van den Berg

    Please refer to the section on shading.

    Note that this system uses ASCII characters to encode hieroglyphic text,
    but this shouldn't suggest that Unicode would recommend that hieroglyphics
    be encoded using either ASCII or a higher-level protocol.

    "RES" was offered to replace the Manuel de Codage, but is more of
    a mark-up scheme than a computer encoding:

    Damaged glyphs are shown in running hieroglyphic text. Here are
    some handy examples, all in PDF format, from this page:

    Breasted (1906) uses dashes and brackets to denote missing or partially
    obliterated material in transcription.

    [ Those with good connections may wish to check out the PDF version
    by Prof. Breasted (The University of Chicago Press, 1906) linked on this page: ]

    Note how Brian Colless uses ASCII brackets to denote damaged glyphs in
    plain text in this letter concerning the decipherment of the Byblos
    Syllabary on this page:

    Lloyd Anderson made a survey of brackets, dots, and hashing used in
    various studies back in 1998.

    The survey request is here:
    ( )

    I was unable to find the results of that survey on-line, though.

    In David Stuart's transcription of a Mayan inscription from Palenque at:
    ... an ASCII question mark is used to denote an indecipherable glyph and
    brackets are used to denote a reconstructed glyph.
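
    As a rough illustration of how such plain-text conventions stay
    machine-readable, here is a minimal Python sketch. It assumes a
    simplified convention modeled on the description above ("?" for an
    indecipherable glyph, "[...]" for a reconstructed one); the function
    name, the status labels, and the sample line are all made up for
    illustration and are not taken from Stuart's transcription.

```python
import re

# Assumed, simplified convention (hypothetical, for illustration only):
#   "?"      -> an indecipherable glyph
#   "[xyz]"  -> a reconstructed glyph
#   anything else -> a glyph read as intact
TOKEN = re.compile(r"\[[^\]]*\]|\?|[^\s\[\]?]+")


def classify(transcription):
    """Return (text, status) pairs for each glyph token in a plain-text line."""
    result = []
    for token in TOKEN.findall(transcription):
        if token == "?":
            result.append((token, "indecipherable"))
        elif token.startswith("["):
            # Strip the surrounding brackets, keep the reconstructed reading.
            result.append((token[1:-1], "reconstructed"))
        else:
            result.append((token, "intact"))
    return result


# Made-up sample line, not taken from any real inscription.
sample = "ka [wa] ? la"
for text, status in classify(sample):
    print(text, status)
```

    Because the damage marks are ordinary characters in the text stream,
    any plain-text tool can find and process them without knowing about
    a higher-level protocol.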

    Plain text is considered useful!

    Quoting from this page,

         "Why bother with plain text? The benefits are insurance
         against obsolescence, leverage, and easier testing.
         Human-readable forms of data, and self-describing data,
         will outlive all other forms of data and the applications
         that created them. Period. As long as the data survives,
         you'll be able to use it long after the original application
         is defunct."


         "Nearly every tool in the computing universe, from
         source code management systems to compiler
         environments to editors and standalone filters, can
         operate in plain text. So if you need to ensure that
         all parties can communicate using a common standard,
         use plain text."

    For my two cents' worth, Dean A. Snyder's suggestion for handling
    damaged glyphs *at the character level* has merit.

    Best regards,

    James Kass
