Re: character entities in UTF-8 files

From: Chris Jacobs (chris.jacobs@freeler.nl)
Date: Tue Jul 12 2005 - 18:28:59 CDT

    ----- Original Message -----
    From: "Peter Constable" <petercon@microsoft.com>
    To: <unicode@unicode.org>
    Sent: Tuesday, July 12, 2005 11:03 PM
    Subject: RE: character entities in UTF-8 files

    > > From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]
    > > On Behalf Of Chris Jacobs
    >
    > > > We have an XML based application...
    >
    > > Only it does not stand for e acute, as far as unicode is involved it
    > > just stands for itself, for &#233;.
    > >
    > > Of course you are allowed to have agreements with your users about
    > > replacing &#233; by e acute or by whatever you want to replace it by.
    >
    > Since this is an XML application, then at the level of XML parsing,
    > &#233; must be interpreted as e-acute; he is not allowed to have
    > agreements with his users about replacing &#233; with anything else.

    Except that the original post does not say "specifies UTF-8 files as
    source"; it says "specifies UTF-8 files as input".
    So this &#233; is not in the XML source, but in the input that the XML
    application reads.
    The &#233; will then not be parsed as XML, just as, when you write a
    text-editing program in BASIC, the edited text is not scanned for BASIC
    keywords unless you, for whatever reason, program it to do so.
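
    To illustrate the distinction, here is a minimal sketch in Python. It
    assumes the standard xml.etree.ElementTree parser stands in for "XML
    parsing"; the sample string and the variable names are made up for this
    example. The same characters are first treated as plain input text, then
    fed through an XML parser:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the characters as they might sit in a UTF-8 input file.
raw = "caf&#233;"

# Treated as plain input text: nothing expands the reference,
# so "&#233;" stays as the six characters & # 2 3 3 ;
as_text = raw

# Treated as XML content: the parser resolves the numeric character
# reference &#233; to the single character U+00E9 (e acute).
as_xml = ET.fromstring("<p>" + raw + "</p>").text

print(len(as_text))  # length of the unparsed text
print(len(as_xml))   # length after XML parsing
```

    Whether the application ever takes the second path for its input files
    is a decision made by whoever programs it, which is the point of the
    BASIC comparison.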


