RE: UTF-8N?

From: Michael Kaplan (Trigeminal Inc.) (v-michka@microsoft.com)
Date: Tue Jun 20 2000 - 12:16:02 EDT


Danger is a relative term, I think.

Windows 2000 Notepad writes a BOM so that it can easily recognize a file you
saved as UTF-8 as actually being UTF-8 the next time you load it. If you
remove it, Notepad may not be able to recognize the file as UTF-8.
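
To make the "tag" concrete, here is a minimal sketch of the kind of check a
loader could do. It only tests for the three bytes EF BB BF at the start of
the data. The function name has_utf8_bom is made up for illustration, and
this is not a claim about how Notepad itself detects the encoding.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 signature EF BB BF."""
    # codecs.BOM_UTF8 is the byte string b'\xef\xbb\xbf'.
    return data.startswith(codecs.BOM_UTF8)


if __name__ == "__main__":
    tagged = codecs.BOM_UTF8 + "hello".encode("utf-8")
    untagged = "hello".encode("utf-8")
    print(has_utf8_bom(tagged))
    print(has_utf8_bom(untagged))
```

Without the signature, a loader has to fall back on guessing from the
content, which is the situation described above.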

You should obviously never display it, but if you delete it then you
knowingly remove a "tag" that at least one program felt it needed to add.
Being "neighborly" with other programs would imply that you would be nicer
to your neighbors. :-)
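
One way to be neighborly, sketched under my own assumptions: strip the
signature before handing text to anything that displays it, remember whether
it was there, and put it back when saving. The helper names split_bom and
join_bom are hypothetical, not taken from any existing program.

```python
import codecs
from typing import Tuple


def split_bom(data: bytes) -> Tuple[bool, str]:
    """Decode UTF-8 data, returning (had_bom, text) with the BOM removed."""
    had_bom = data.startswith(codecs.BOM_UTF8)
    body = data[len(codecs.BOM_UTF8):] if had_bom else data
    return had_bom, body.decode("utf-8")


def join_bom(had_bom: bool, text: str) -> bytes:
    """Encode text as UTF-8, restoring the BOM only if the input had one."""
    encoded = text.encode("utf-8")
    return codecs.BOM_UTF8 + encoded if had_bom else encoded


if __name__ == "__main__":
    original = codecs.BOM_UTF8 + "abc".encode("utf-8")
    had_bom, text = split_bom(original)
    # The text to display never contains the BOM character.
    edited = text.upper()
    # The saved bytes keep the tag that the other program added.
    saved = join_bom(had_bom, edited)
    print(had_bom, repr(edited), saved)
```

A file that arrived without the signature is written back without it, so the
round trip does not add or remove the tag on its own.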

Michael

> ----------
> From: Antoine Leca[SMTP:Antoine.Leca@renault.fr]
> Sent: Tuesday, June 20, 2000 8:56 AM
> To: Unicode List
> Cc: Unicode List
> Subject: Re: UTF-8N?
>
> Mark Davis wrote:
> >
> > The reason I make that notational distinction in the text is that there
> > is a danger with UTF-8 currently: BOM can be used with it, and some
> > people do. Since, unlike the case of UTF-16 / UTF-16BE / UTF-16LE, there
> > is no way to distinguish between implementations that allow a BOM and
> > those that don't, the situation is slightly unstable: if you find
> > EF BB BF at the start of a UTF-8 file, you don't know whether to delete
> > it or not.
>
> I understand there is no way to know whether you SHALL/SHOULD/MAY delete
> it or not, but I fail to see the danger: BOM (well, ZWNBSP) cannot carry
> any useful meaning when it appears at the beginning of a text, can it?
> So what can be the problem?
>
> What am I missing?
>
>
> Antoine
>


