Re: UTF-8N?

From: John Cowan (jcowan@reutershealth.com)
Date: Tue Jun 20 2000 - 15:41:50 EDT


Juliusz Chroboczek wrote:

> At one point, I thought that with Unicode there would be only one
> cross-platform encoding, and that a plain text file from a Mac and a
> plain text file from a Windows machine would be the same thing (up to
> some uninteresting variations in line ending).
>
> Later, I thought that there would be two Unicode encodings, the ones
> that are now called UTF-16BE and UTF-8N. I was prepared to live with
> that.
>
> Right now, it looks like there will be at least 8 Unicode encodings,
> at least 4 of which will be in common use (big-endian UTF-16, UTF-16BE,
> UTF-8N, UTF-8). What is worse, some of these formats, including the
> most common one, will have to be treated specially when applying
> mundane operations such as splitting a file.

I think that the variations in BOM are just as "uninteresting" as the
variations in line ending: it is as easy to write programs to ignore
the one as the other, and stupid programs will blow up on the one
as on the other.
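
To illustrate the point, here is a minimal sketch in Python of a reader that ignores both variations. The function name read_plain_text is hypothetical, and treating BOM-less input as UTF-8 is an assumption made for the example, not something any standard mandates. UTF-32 is not handled.

```python
import codecs

def read_plain_text(raw: bytes) -> str:
    """Decode UTF-8 or UTF-16 bytes, ignoring any BOM and the line-ending style."""
    if raw.startswith(codecs.BOM_UTF8):
        # UTF-8 with a signature: drop the three BOM bytes.
        text = raw[len(codecs.BOM_UTF8):].decode("utf-8")
    elif raw.startswith(codecs.BOM_UTF16_LE):
        text = raw[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    elif raw.startswith(codecs.BOM_UTF16_BE):
        text = raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    else:
        # Assumption for this sketch: no BOM means plain UTF-8.
        text = raw.decode("utf-8")
    # Fold CRLF and bare CR into LF, so Mac, Unix and Windows files compare equal.
    return text.replace("\r\n", "\n").replace("\r", "\n")
```

With this, the same text saved as UTF-8 with a signature and CRLF line endings, or as big-endian UTF-16 with a BOM and LF line endings, comes back as the same string.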

Substantively, the only real encodings are UTF-8 and UTF-16.

-- 

Schlingt dreifach einen Kreis um dies! || John Cowan <jcowan@reutershealth.com>
Schliesst euer Aug vor heiliger Schau, || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,          || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.      -- Coleridge (tr. Politzer)


