"Leif H Silli" <xn--mlform-iua_at_xn--mlform-iua.no> wrote:
|We now have some data that indicates that what Unicode says about the UTF-8
|BOM is worded in a way that is possible to misunderstand. I support you in
Yeah! Yeah! Yeah! That is good to read, black on #FCFCF9.
|Steven replied:
|
|>> In XML 1.0 the BOM is in fact described as a signature regardless of
|>> which Unicode encoding it is used with:
|>>
|>> |http://www.w3.org/TR/xml/#charencoding
|>
|> Yes, simply spoken out and clarified like that, and everybody
|> knows what to deal with.
|>
|> And btw., my local copy of XML 1.1 (Second Edition, thus current)
|> doesn't include this paragraph (in the referenced 4.3.3):
|>
|> |If the replacement text of an external entity is to begin with
|> |the character U+FEFF, and no text declaration is present, then
|> |a Byte Order Mark MUST be present, whether the entity is encoded
|> |in UTF-8 or UTF-16.
|
|I think you must reread. I find the same "signature" sentence in XML 1.1:
|
|http://www.w3.org/TR/xml11/#charencoding
|
|> But I don't see the big picture of all those markup standards; I
|> just have them in case my own work raises some questions.
|
|We now have some data that indicates that what Unicode says about the UTF-8
|BOM is worded in a way that is possible to misunderstand. I support you in
|that Unicode should be more explicit about the fact that
|
|* It is neutral about the BOM in UTF-8 (currently it is possible to read it
|as if Unicode advises against the BOM)
|
|* The BOM is an encoding signature - for both UTF-8 and UTF-16.
|--
|leif halvard silli
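
To make the "signature" reading concrete, here is a minimal Python sketch of
treating a leading BOM as an encoding signature for UTF-8 and UTF-16. The
function names sniff_bom and decode_with_signature are hypothetical, chosen
only for this illustration; the BOM constants come from Python's standard
codecs module. It is a sketch under those assumptions, not what an XML
processor actually does (there is no handling of an encoding declaration).

```python
import codecs

# Signatures checked in this sketch. UTF-32 is deliberately left out because
# the thread only discusses UTF-8 and UTF-16; note that a UTF-32LE BOM
# (FF FE 00 00) would be misread here as UTF-16LE.
_SIGNATURES = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
)


def sniff_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no signature is found."""
    for bom, name in _SIGNATURES:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_with_signature(data: bytes, default: str = "utf-8") -> str:
    """Decode data, dropping a leading BOM that serves as a signature."""
    encoding, skip = sniff_bom(data)
    # Without a signature, fall back to the caller's default (an assumption
    # of this sketch, not something the quoted specifications prescribe).
    return data[skip:].decode(encoding or default)
```

With this sketch, decode_with_signature(b"\xef\xbb\xbf<a/>") gives "<a/>",
the same result as for the bytes without the BOM: the signature identifies
the encoding and is not part of the text.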
Received on Mon Jul 30 2012 - 06:00:42 CDT