> 2012-07-13 22:37, David Starner wrote:
>> Wikipedia says "The Unicode standard recommends against the BOM for
>> UTF-8." and refers to page 30 of the Unicode Standard, version 6.0,
>> that says "Use of a BOM is neither required nor recommended for
>> UTF-8..." Calling it a myth seems bizarre.
> “Not recommended” is distinct from “recommends against”.

I disagree; the meaning of the two phrases overlaps in my idiolect, and
while it would be somewhat laconic, I might use "not recommended" to
mean "if you insist on doing that, please give us a chance to get the
fire extinguisher first".

> A
> more appropriate formulation would be “Use of a BOM is not required for UTF-8,
> but may be used as a signature that indicates, with practical certainty,
> that data is UTF-8 encoded.”

In the environment that UTF-8 was developed for, a BOM is a nuisance;
a BOM will stop the shell from properly interpreting a hashbang, and
other existing programs will lose the BOM, duplicate the BOM, and
scatter BOMs throughout files. Given the number of text-like file
formats (like old-school PNM) and the number of scripts depending on
existing behavior, these programs aren't going to be changed.
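
To make the hashbang point concrete, here is a minimal Python sketch.
It assumes the usual Unix rule that a script is recognized by the two
bytes "#!" at the very start of the file; the helper names
(starts_with_hashbang, strip_utf8_bom) and the sample script are made
up for illustration and do not come from any particular tool.

```python
import codecs

# The UTF-8 encoded BOM is the three bytes EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def starts_with_hashbang(data: bytes) -> bool:
    """Return True if the file content begins with the bytes '#!'."""
    return data[:2] == b"#!"


def strip_utf8_bom(data: bytes) -> bytes:
    """Remove a single leading UTF-8 BOM, if present."""
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


# Hypothetical script content, with and without a leading BOM.
script = b"#!/bin/sh\necho hello\n"
with_bom = UTF8_BOM + script

# A BOM-unaware concatenation (as `cat a b` would do) leaves a BOM
# in the middle of the result.
concatenated = with_bom + with_bom

print(starts_with_hashbang(script))     # hashbang is at byte 0
print(starts_with_hashbang(with_bom))   # BOM pushes it to byte 3
print(concatenated.count(UTF8_BOM))     # one BOM per input file
```

With the BOM in front, the first two bytes are EF BB rather than "#!",
so a byte-exact check fails. Plain concatenation keeps every input's
BOM, which is one way BOMs end up scattered through a file.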

As I said before, Unicode simplified but did not solve the problem that
text from other operating systems requires some modification before
working just right. But I don't think that Unicode should
unconditionally recommend the UTF-8 BOM, because it is problematic in
the field of use that UTF-8 was created for and is still used in.

