Re: terminology: plaintext (was Re: unicode Digest V5 #149)

From: François Yergeau
Date: Fri Jun 24 2005 - 16:58:45 CDT

    Asmus Freytag wrote:
    > HTML is a representation of rich text expressed in a plain text format.

    I think this continues the confusion. I'd rather say that "HTML is a
    representation of rich text expressed in a text format." (not a *plain*
    text format).

    It is not plain precisely because some of the text is designated (by the
    HTML spec) to be interpreted as markup.
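
    To make the distinction concrete, here is a minimal sketch, assuming
    Python's standard html.parser module as the markup-aware tool. The
    sample string and the names Splitter and split_markup are made up for
    illustration. The same characters are one undifferentiated string to a
    plain text tool, while an HTML-aware tool sets some of them aside as
    markup.

```python
from html.parser import HTMLParser

# Hypothetical sample source, chosen only for illustration.
SOURCE = "<p>Hello, <em>world</em>!</p>"


class Splitter(HTMLParser):
    """Separates what the parser treats as markup from character data."""

    def __init__(self):
        super().__init__()
        self.markup = []  # tags, as recognized by the parser
        self.text = []    # character data between the tags

    def handle_starttag(self, tag, attrs):
        self.markup.append(f"<{tag}>")

    def handle_endtag(self, tag):
        self.markup.append(f"</{tag}>")

    def handle_data(self, data):
        self.text.append(data)


def split_markup(source):
    """Return (list of tags, concatenated character data) for an HTML string."""
    parser = Splitter()
    parser.feed(source)
    parser.close()
    return parser.markup, "".join(parser.text)


# Plain text view: every character, including '<' and '>', is just text.
plain_view = SOURCE

# HTML-aware view: some characters are interpreted as markup.
markup, content = split_markup(SOURCE)

print(plain_view)
print(markup)
print(content)
```

    A plain text editor works with plain_view only. Anything that produces
    the markup/content split knows something above the level of plain text.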

    > When you view and edit HTML source, you are accessing it as plain text.

    Correct. Your tool (plain text editor) doesn't know anything about HTML
    and interprets all text as the only thing it knows about: plain text.

    > The point is that the HTML source is not the same as the HTML text,
    > even though they are related (by the HTML protocol).

    Another point is that the HTML source is also text (but not plain),
    contrary to binary rich text formats.

    > Syntax coloring and content-driven styles are even more of a red herring
    > in this context.

    If your editor colorizes HTML source, it's because it knows something
    about it above the level of plain text, i.e. it considers it rich text
    and enriches the presentation accordingly.

