From: Mark Davis (email@example.com)
Date: Sun Nov 03 2002 - 15:29:32 EST
I don't know what you are trying to say. Perhaps you could explain it at the
meeting next week.
► “Eppur si muove” ◄
----- Original Message -----
From: "Michael (michka) Kaplan" <firstname.lastname@example.org>
To: "Mark Davis" <email@example.com>; "Murray Sargent"
<firstname.lastname@example.org>; "Joseph Boyle" <Boyle@siebel.com>
Sent: Saturday, November 02, 2002 04:18
Subject: Re: Names for UTF-8 with and without BOM
> From: "Mark Davis" <email@example.com>
> > That is not sufficient. The first three bytes could represent a real
> > character, ZWNBSP, or they could be a BOM. The label doesn't tell you.
> There are several problems with this supposition -- most notably the fact
> that there are cases that specifically claim this is not recommended and
> that U+2060 is preferred.
> > This is similar to UTF-16 CES vs UTF-16BE CES. In the first case, 0xFE 0xFF
> > represents a BOM, and is not part of the content. In the second case, it
> > does *not* represent a BOM -- it represents a ZWNBSP, and must not be
> > stripped. The difference here is that the encoding name tells you
> > what the situation is.
> I do not see this as a realistic scenario. I would argue that if the BOM
> matches the encoding scheme, perhaps this was an intentional effort to make
> sure that applications which may not understand the higher level protocol
> can also see what the encoding scheme is.
> But even if we assume that someone has gone to the trouble of calling
> something UTF-16BE and has 0xFE 0xFF at the beginning of the file, what kind
> of content *is* such a code point that this is even worth calling out as a
> special case?
> If the goal is clear and unambiguous text, then the best way would be to
> simplify ALL of this. It was previously decided to always call it a BOM, so
> why not stick with that?
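
The label-dependent behaviour debated in the quoted text can be illustrated with a small sketch. It uses Python's standard codecs purely as an example of one implementation; nothing in the thread refers to Python, and the variable names are hypothetical. The same leading bytes are either stripped as a signature or kept as U+FEFF content, depending only on the encoding name passed to the decoder.

```python
import codecs

# UTF-8: the bytes EF BB BF followed by "A".
utf8_bytes = codecs.BOM_UTF8 + "A".encode("utf-8")
# Plain "utf-8" keeps the leading U+FEFF as a character in the content.
utf8_kept = utf8_bytes.decode("utf-8")
# Python's "utf-8-sig" codec treats the same bytes as a signature and drops them.
utf8_stripped = utf8_bytes.decode("utf-8-sig")

# UTF-16: the bytes FE FF followed by big-endian "A".
utf16_bytes = codecs.BOM_UTF16_BE + "A".encode("utf-16-be")
# Labelled "utf-16": FE FF is read as a BOM and is not part of the content.
utf16_stripped = utf16_bytes.decode("utf-16")
# Labelled "utf-16-be": FE FF is read as U+FEFF (ZWNBSP) and is preserved.
utf16_kept = utf16_bytes.decode("utf-16-be")

print(repr(utf8_kept), repr(utf8_stripped))
print(repr(utf16_stripped), repr(utf16_kept))
```

In this sketch the stripped results equal "A" and the kept results equal "\ufeffA". Whether other decoders behave the same way for a given label is exactly the question under discussion, so this should be read as one data point rather than as normative behaviour.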