Re: character entities in UTF-8 files

From: Eric Muller
Date: Wed Jul 13 2005 - 10:35:19 CDT

    Elliotte Harold wrote:

    > <![CDATA[foo]]&gt;bar]]> is parsed as the string "foo]]&gt;bar", not
    > "foo]]>bar". There is no way to represent the three character sequence
    > ]]> inside a CDATA section. You have to close the CDATA section, emit
    > a > character, and open a new CDATA section.
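
    The parsing behaviour described above can be checked with a short
    sketch. It assumes Python's standard xml.etree.ElementTree (backed by
    expat); the wrapper element name "r" is made up for illustration only.

```python
import xml.etree.ElementTree as ET

# Inside a CDATA section, "&gt;" is literal text, not an entity reference.
literal = ET.fromstring("<r><![CDATA[foo]]&gt;bar]]></r>").text
print(literal)  # foo]]&gt;bar

# To get the three characters "]]>", split them across two CDATA sections
# so that neither section contains the closing delimiter.
split = ET.fromstring("<r><![CDATA[foo]]]]><![CDATA[>bar]]></r>").text
print(split)  # foo]]>bar
```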

    You are right, I was misled by XML 1.0, third edition,
    section 2.4, end of 3rd

        The right angle bracket (>) /MAY/ be represented using the string
        "&gt;", and /MUST/, for compatibility, be escaped
        using either "&gt;" or a character reference when it appears in
        the string "]]>" in content, when that string is not marking the
        end of a CDATA section

    which could be improved by adding "... and is not in a CDATA section."
    and a sentence like "The string "]]>" cannot appear in a single CDATA
    section ("<![CDATA[...]]>]]&gt;<![CDATA[...]]>" is a possible pattern for
    the content "...]]>..." that overcomes this limitation)."
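
    A minimal sketch of that pattern, assuming Python and its standard
    xml.etree.ElementTree for the round-trip check. The helper name
    to_cdata and the wrapper element "r" are hypothetical. Each "]]>" in
    the text ends the current CDATA section, is written in ordinary
    content as "]]&gt;" (so that it is legal there), and is followed by a
    new CDATA section.

```python
import xml.etree.ElementTree as ET

def to_cdata(text: str) -> str:
    """Wrap text in CDATA sections, emitting each "]]>" outside them."""
    # Pieces between occurrences of "]]>" are safe to place in a section.
    pieces = text.split("]]>")
    # Close the section, emit the escaped delimiter, open the next one.
    return "<![CDATA[" + "]]>]]&gt;<![CDATA[".join(pieces) + "]]>"

sample = "foo]]>bar"
encoded = to_cdata(sample)
print(encoded)  # <![CDATA[foo]]>]]&gt;<![CDATA[bar]]>

# Round trip: the parser should hand back the original text.
decoded = ET.fromstring("<r>" + encoded + "</r>").text
print(decoded)  # foo]]>bar
```

    Splitting as "]]" and ">" across two adjacent CDATA sections works
    equally well; the version above simply follows the pattern quoted in
    the message.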

