Re: Euro character in ISO

From: Frank da Cruz (fdc@columbia.edu)
Date: Wed Jul 12 2000 - 15:49:24 EDT


On Wed, 12 Jul 2000 10:43:59 -0800, Robert A. Rosenberg wrote:
> At 08:56 PM 07/11/2000 -0800, Geoffrey Waigh wrote:
> >On Tue, 11 Jul 2000, Robert A. Rosenberg wrote:
> > > At 15:30 -0800 on 07/11/00, Asmus Freytag wrote:
> > > >There has been an attempt to create a series of 'touched up' 8859
> > > >standards. The problem with these is that you get all the issues of
> > > >character set confusion that abound today with e.g. Windows CP 1252
> > > >mistaken for 8859-1 with a vengeance:
> > >
> > > The problem would go away if the ISO would get their heads out of
> > > their a$$ and drop the C1 junk from the NEW 'TOUCHED UP' 8859s and
> > > put the CP125x codes there.
> >
> >Except that would break all the systems that understand that C1 "junk,"
> >and a number of systems do so because they are adhering to other
> >ISO standards. If you are going to force someone to change their
> >datastreams to something new, they might as well go to some flavour
> >of Unicode anyways.
>
> Who is going to get broken if I say on my MIME header (or HTML) that my
> CHARSET is (example) ISO-8859-21?
>
We go through this exercise about twice a year. First, let's recognize
that ISO is not about to revoke Standards 4873 and 2022, so there's not
much point in suggesting it. Second, think of a terminal that complies
with these standards. A physical terminal such as a VT320. I am using it
to access my mail host in text mode, and I'm reading mail with (say) Unix
'mail'. The terminal does not interpret the MIME headers. It doesn't
parse HTML. It runs a very straightforward finite-state automaton
that implements an ISO 2022-based terminal. Unix 'mail' sends to my
terminal the bytes of the message, period.
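
To make the point concrete, here is a minimal Python sketch of how such a
device sorts incoming bytes into code-table regions, and which bytes of a
CP1252-encoded message would land in the C1 control area. The function
names `region` and `c1_bytes` and the sample string are my own
illustrations, not part of any standard or of any terminal's firmware, and
the region split is simplified (SP and DEL are lumped in with GL).

```python
# Sketch only: classify bytes the way an 8-bit ISO 4873-style receiver
# divides the code table. Names here are hypothetical illustrations.

def region(byte: int) -> str:
    """Return the code-table region for one byte (simplified)."""
    if byte < 0x20:
        return "C0"   # control characters, e.g. ESC, CR, LF
    if byte < 0x80:
        return "GL"   # left-hand graphics (SP and DEL lumped in here)
    if byte < 0xA0:
        return "C1"   # 8-bit control characters, e.g. CSI at 0x9B
    return "GR"       # right-hand graphics

def c1_bytes(text: str, charset: str) -> list:
    """Bytes of the encoded text that the receiver would treat as C1 controls."""
    return [b for b in text.encode(charset) if region(b) == "C1"]

# Euro sign, ellipsis, and right angle quote, as a CP1252 sender encodes them.
sample = "5 \u20ac\u2026 \u203a"
print([hex(b) for b in c1_bytes(sample, "cp1252")])  # ['0x80', '0x85', '0x9b']
```

The last of those bytes, 0x9B, is CSI in the C1 set, so a terminal that
honors 8-bit controls would start parsing a control sequence at that
point rather than display a quotation mark.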

Perhaps you're suggesting that Unix 'mail' should become a translation
agent between the character set of the mail and that of the user's
terminal? I hope not: practically any character set anybody can dream
up is "MIME-compliant" as long as it's tagged, so every mail program
would have to know how to convert from every character set in existence
to every other one. Or is it the mail transfer agent? Or both?
It's really quite a mess; let's not go out of our way to make it worse.
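
As a rough illustration of the scale, assuming one direct converter per
ordered pair of character sets, the count grows quadratically. The
function name and the charset counts below are hypothetical, chosen only
to show the arithmetic:

```python
# Sketch only: number of direct converters needed if every character set
# must be convertible to every other one (one converter per ordered pair).

def converters_needed(n_charsets: int) -> int:
    """Ordered pairs (source, target) with source != target."""
    return n_charsets * (n_charsets - 1)

for n in (10, 100):  # hypothetical charset counts
    print(n, converters_needed(n))  # 10 -> 90, 100 -> 9900
```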

To understand the implications of using 8-bit character sets that contain
graphic characters in the C1 area FOR INTERCHANGE, imagine trying to do
the same thing to the C0 area.

- Frank


