Re: character entities in UTF-8 files

From: Andy Heninger
Date: Thu Jul 14 2005 - 01:13:47 CDT

    Gregg Reynolds wrote:

    > an XML parser will
    > first *replace* character entities, before passing the data to the
    > consuming application. When that happens in relation to parsing (i.e.
    > checking for well-formedness) is implementation-dependent,

    It's implementation-dependent only because so many implementations get
    it wrong. XML's rules for entity replacement and construction of the
    text to be delivered by a parser to the application are astoundingly,
    mind-bogglingly complicated. SGML heritage is largely to blame, I've
    been told.

    for the official story.
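
    To make the replacement step concrete, here is a minimal sketch. It
    assumes Python's standard xml.etree.ElementTree module (my choice for
    illustration, not something from the thread), and delivered_text is a
    hypothetical helper name. It only shows what a conforming parser hands
    to the application for simple cases: predefined entities and numeric
    character references arrive already replaced, and a character
    reference to '<' becomes literal text rather than new markup.

```python
import xml.etree.ElementTree as ET


def delivered_text(xml_source: str) -> str:
    """Return the character data the application sees for the root element."""
    return ET.fromstring(xml_source).text or ""


# Predefined entities (&lt;, &amp;) and a numeric character reference
# (&#xE9;) are replaced by the parser before the application sees the text.
escaped = "<p>1 &lt; 2 &amp; caf&#xE9;</p>"
print(delivered_text(escaped))

# Character references to '<' and '>' yield literal characters. They are
# not re-scanned as markup, so <p> ends up with text and no child elements.
root = ET.fromstring("<p>&#60;b&#62;bold?&#60;/b&#62;</p>")
print(root.text, len(root))
```

    The harder cases (internal entities whose replacement text itself
    contains markup or further references) are where the rules get
    complicated; this sketch does not attempt to cover them.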

    > if I'm not
    > mistaken. I find the XML spec a little fuzzy on that point (I can't
    > wait for the English translation); it talks about at least < and some
    > other char entities being "escaped".

       -- Andy Heninger

