Re: UTF-8N?

From: Peter_Constable@sil.org
Date: Wed Jun 21 2000 - 11:32:53 EDT

Next message: Doug Ewell: "Re: How to distinguish UTF-8 from Latin-* ?"
Previous message: Daniel Biddle: "Re: How to distinguish UTF-8 from Latin-* ?"
Maybe in reply to: Masahiko Maedera: "UTF-8N?"
Next in thread: John Cowan: "Re: UTF-8N?"

On 06/20/2000 08:20:53 PM <dewell@compuserve.com> wrote:

[snip]

>It may be useful shorthand to define the term "UTF-8N" to refer to UTF-8
>text that does not begin with a BOM, and reserve the term "UTF-8" for text
>that *does* begin with a BOM,

"UTF-8" currently does not, and so should not, be used to indicate the
definite presence of a BOM.

>but the fact is that both are really UTF-8, and people
>will use the term "UTF-8" to refer to both.

And rightly so.

> Adding (let alone registering) a
>new charset name to express this relatively minor difference will make it
>look (as it does to Juliusz) like there are more Unicode encoding forms
>than there really are.

We don't want distinct encoding schemes (schemes, I think, not forms) for
the UTF-8 encoding form that are distinguished by the presence or the
absence of a BOM. Presence or absence of a BOM doesn't constitute a
difference in encoding scheme for UTF-8, or even for UTF-16, for that
matter, because it is something separate from the character stream itself.
UTF-8 files both with and without a BOM serialize the character
representations into bytes (octets) in exactly the same way. That's the
basis for distinguishing between encoding schemes, and since there isn't a
difference, there is only one encoding scheme involved in both cases.
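
As a rough illustration of the point above (a sketch only, not part of the
original exchange): Python's standard "utf-8" and "utf-8-sig" codecs can be
compared to show that the bytes for the characters themselves are identical,
and that the only difference is the three-octet signature EF BB BF at the
start. The helper name strip_bom and the sample string are assumptions made
up for this example.

```python
import codecs


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 signature (EF BB BF), if present.

    Hypothetical helper for illustration; not from the original message.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Sample text mixing 1-, 2-, 3- and 4-byte UTF-8 sequences (arbitrary choice).
text = "UTF-8N? \u00e9\u65e5\U00010348"

plain = text.encode("utf-8")        # no signature written
signed = text.encode("utf-8-sig")   # signature written first

# The character data is serialized into exactly the same octets either way.
assert signed == codecs.BOM_UTF8 + plain
assert strip_bom(signed) == plain
assert strip_bom(plain) == plain

# A decoder that tolerates the signature recovers the same text from both.
assert signed.decode("utf-8-sig") == text
assert plain.decode("utf-8-sig") == text
```

Under these assumptions, the two byte strings differ only in the optional
prefix, which is consistent with treating both as the same encoding scheme.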

Peter Constable


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:04 EDT