Re: (Informational only: UTF-8 BOM and the real life)

From: Steven Atreju <snatreju_at_googlemail.com>
Date: Mon, 30 Jul 2012 12:52:13 +0200

"Leif H Silli" <xn--mlform-iua_at_xn--mlform-iua.no> wrote:

|We now have some data that indicates that what Unicode says about the UTF-8
|BOM is worded in a way that is possible to misunderstand. I support you in

Yeah! Yeah! Yeah!, that is good to read black on #FCFCF9.

|Steven replied:
|
|>>In XML 1.0 the BOM is in fact described as a signature regardless of
|>> which Unicode encoding it is used with:
|>>
|>> |http://www.w3.org/TR/xml/#charencoding
|>
|> Yes, simply stated and clarified like that, and everybody
|> knows what they have to deal with.
|>
|> And btw., my local copy of XML 1.1 (Second Edition, thus current)
|> doesn't include this paragraph (in the referenced 4.3.3):
|>
|> |If the replacement text of an external entity is to begin with
|> |the character U+FEFF, and no text declaration is present, then
|> |a Byte Order Mark MUST be present, whether the entity is encoded
|> |in UTF-8 or UTF-16.
|
|I think you must reread. I find the same "signature" sentence in XML 1.1:
|
|http://www.w3.org/TR/xml11/#charencoding
|
|> But I don't see the big picture of all those markup standards; I
|> just have them in case my own work raises some questions.
|
|We now have some data that indicates that what Unicode says about the UTF-8
|BOM is worded in a way that is possible to misunderstand. I support you in
|that Unicode should be more explicit about the fact that
|
|* it is neutral about the BOM in UTF-8 (currently it is possible to read it
|as if Unicode advises against the BOM)
|
|* The BOM is an encoding signature - for both UTF-8 and UTF-16.
|--
|leif halvard silli
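
To make the "encoding signature" reading concrete, here is a minimal
sketch of how a consumer might treat a leading BOM as a signature for
both UTF-8 and UTF-16. The helper name sniff_bom and the signature
table are hypothetical, made up for illustration; they come from
neither the XML nor the Unicode specification. UTF-32 is deliberately
left out because the thread only discusses UTF-8 and UTF-16.

```python
import codecs

# Hypothetical table: leading byte sequence -> encoding it signals.
# Only UTF-8 and UTF-16 are covered, as in the discussion above.
# (A real sniffer would have to test UTF-32 first, since the UTF-32 LE
# BOM starts with the same two bytes as the UTF-16 LE BOM.)
_SIGNATURES = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def sniff_bom(data: bytes):
    """Return (encoding, payload without BOM), or (None, data) if no BOM."""
    for bom, name in _SIGNATURES:
        if data.startswith(bom):
            return name, data[len(bom):]
    return None, data
```

With this sketch, a caller would decode the returned payload with the
returned encoding name, and fall back to some other rule (a
declaration, a default) when the first element is None.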
Received on Mon Jul 30 2012 - 06:00:42 CDT
