RE: UTF-8 BOM Nonsense

From: Michael Kaplan (Trigeminal Inc.)
Date: Fri Jun 23 2000 - 15:52:01 EDT

Yes, I do feel this way, actually. :-)

The standard is quite clear in its language; one does not have to be a
semanticist to understand that:

1) XML *is* considered to be UTF-8 if there is no BOM and UTF-16 if there
is one
2) The encoding tag was added in recognition of the fact that #1 will not be
enough for some people
3) An XML parser *must* be able to do #1, but #2 is not required.
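
For what it is worth, rule #1 can be sketched in a few lines of Python. This
is only an illustration of the BOM check described above, not a conforming
parser: the names sniff_xml_encoding and decode_xml are made up for this
example, it ignores the encoding tag from #2 entirely, and it does not try to
recognize UTF-32 or BOM-less UTF-16. Accepting a UTF-8 signature is an extra
assumption on my part, since that is what this thread is about.

```python
import codecs


def sniff_xml_encoding(data: bytes) -> str:
    """Guess a Python codec name for an XML document from its BOM only."""
    # A UTF-16 document starts with a byte order mark, in either byte order.
    if data.startswith(codecs.BOM_UTF16_BE) or data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16"
    # Assumption: tolerate the optional UTF-8 signature (EF BB BF).
    # The "utf-8-sig" codec strips it while decoding.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # No BOM at all: treat the document as UTF-8.
    return "utf-8"


def decode_xml(data: bytes) -> str:
    """Decode the raw bytes using the codec chosen by the BOM check."""
    return data.decode(sniff_xml_encoding(data))
```

Calling decode_xml on the bytes of a file gives back the text with any BOM
removed, so the same code path handles both the UTF-8 and UTF-16 cases.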

So I would not discourage the tag, ever. But I would also never do XML that
was not UTF-8 or UTF-16. :-)


> ----------
> From: Robert A. Rosenberg
> Sent: Friday, June 23, 2000 11:34 AM
> To: Michael Kaplan (Trigeminal Inc.)
> Cc: Unicode List
> Subject: RE: UTF-8 BOM Nonsense
>
> At 11:31 AM 06/22/2000 -0800, Michael Kaplan (Trigeminal Inc.) wrote:
> >I do not believe that this will require it to be added to a standard, and
> >this is a non-standard usage, but life is about dealing with things as they
> >are (and this is how they are!).
>
> I assume that you also feel that the charset parm on a MIME Email Header
> (or HTML/XML header) is not needed and thus should be discouraged. The use
> of the BOM character at the start of a TEXT file serves the same purpose as
> the charset tag - It says "I am in UTF-8 format" (so you do not try to
> treat it as ISO-8859-x, CP1252, or some other encoding format).
