Re: unicode Digest V5 #149

From: James Kass
Date: Sat Jun 18 2005 - 21:33:21 CDT

    Gregg Reynolds writes that talking about whether or not any
    representation of plain text is plain or not is plain ridiculous.

    Agreed. It's pointless to talk about it when all one needs to do is
    to compare the representation of text against the Unicode glossary
    definitions of "plain text" and "rich text".

    > You cannot possibly have the slightest
    > idea whether or not a text represented using "bold, italic, underlining,
    > changes in font size and style etc." is originally plain text or not.

    The easiest way to find out is to open the original file
    in any plain text editor. Even without that simple test,
    knowing about the application which is displaying the
    text offers valuable determinative pointers. For example,
    many C editors apply a higher level protocol to the plain
    text source files in order to present text in various colours
    or styles. That representation doesn't alter the fact that
    original source files are plain text.

    By definition, a C source file is plain text. By definition,
    the representation of a plain text C source file in any editor
    which applies a higher level protocol to the text before
    displaying it is rich text.
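
    As a rough illustration of that distinction, the sketch below
    mimics what a colourising editor might do. The keyword list, the
    function names, and the use of ANSI escape sequences as the
    "higher level protocol" are all assumptions made for the example,
    not a description of any particular editor. It shows that the
    styled representation differs from the source, while the source
    itself stays plain text and can be recovered by removing the
    styling.

```python
import re

# Hypothetical, deliberately tiny keyword list; a real editor knows far more.
C_KEYWORDS = ("int", "return", "if", "else", "while", "for", "char", "void")
KEYWORD_RE = re.compile(r"\b(" + "|".join(C_KEYWORDS) + r")\b")

# Matches the ANSI colour/style sequences that highlight() adds.
ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")


def highlight(source: str) -> str:
    """Return a styled representation: keywords wrapped in bold blue."""
    return KEYWORD_RE.sub(
        lambda m: "\x1b[1;34m" + m.group(1) + "\x1b[0m", source
    )


def strip_styling(display: str) -> str:
    """Remove the added styling information, leaving only the text."""
    return ANSI_RE.sub("", display)


# The source file content: nothing but a sequence of characters.
source = "int main(void) { return 0; }\n"

# What the editor shows: the same text plus added styling information.
display = highlight(source)
```

    Here `source` is never modified; only `display` carries the extra
    information, and `strip_styling(display)` gives back the original
    characters.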

    Quoting from the Unicode glossary:

        "Plain Text. Computer-encoded text that consists
        only of a sequence of code points from a given
        standard, with no other formatting or structural
        information. Plain text interchange is commonly
        used between computer systems that do not share
        higher-level protocols. (See also rich text.)"

        "Rich Text. Also known as styled text. The result
        of adding information to plain text. Examples
        of information that can be added include font
        data, color, formatting information, phonetic
        annotations, interlinear text, and so on. The
        Unicode Standard does not address the
        representation of rich text. It is expected that
        systems and applications will implement
        proprietary forms of rich text. Some public
        forms of rich text are available (for example,
        ODA, HTML, and SGML). When everything
        except primary content is removed from
        rich text, only plain text should remain."
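
    The last sentence of the glossary entry can be tried out on HTML,
    one of the public rich text forms it names. The following is only
    a minimal sketch: the class and function names are made up for the
    example, and it treats every tag as "not primary content", which
    is a simplification (it would also keep the text inside script or
    style elements, for instance).

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects character data and ignores all tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for the text between tags; the tags themselves are dropped.
        self.parts.append(data)


def to_plain_text(rich: str) -> str:
    """Remove the markup from an HTML fragment, keeping only its text."""
    parser = TextOnly()
    parser.feed(rich)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


rich = "<p>Text which is <b>represented</b> as <i>text</i> is text.</p>"
plain = to_plain_text(rich)
```

    With the markup removed, `plain` holds only the sequence of
    characters that made up the sentence.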

    > Representation of text is not text.

    Text which is represented as text is text.

    Best regards,

    James Kass

    This archive was generated by hypermail 2.1.5 : Sat Jun 18 2005 - 21:34:58 CDT