Re: 8859-1, 8859-15, 1252 and Euro

From: John Cowan
Date: Mon Feb 07 2000 - 18:05:54 EST

Tim Greenwood wrote:

> So what is a system that stores all data in Unicode and converts for web
> output to do with U+20AC? The formally correct process would seem to be to
> convert to 0x80 only for CP1252 (and the other CP12xx sets), to 0xA4 for
> ISO-8859-15, and to the 'not a character in this set' sign for ISO-8859-1.
> This may be formally correct, but would not help the majority of users. For
> that we would convert to 0x80 for ISO-8859-1 - it works even though 'wrong'.

The trouble with that scheme is that old CP1252 fonts, and non-CP1252
systems, will display a box or nothing. There is no really good solution
to this problem: sometimes 0xA4 makes the most sense (at worst it displays
the wrong glyph), sometimes 0x80. If further processing is to be done
at the client end (as opposed to just display) then some other solution
is preferable.
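
To make the byte values under discussion concrete, here is a minimal
Python sketch (not part of the original exchange) of the "formally
correct" mapping of U+20AC for each charset. The helper name
encode_euro is hypothetical, and the codec names are the ones Python's
standard library uses.

```python
# Illustrative only: what U+20AC becomes in each charset, using
# Python's built-in codecs. encode_euro is a hypothetical helper.
EURO = "\u20ac"

def encode_euro(charset: str) -> bytes:
    """Return the charset's byte for the euro sign, or b"" if it has none."""
    try:
        return EURO.encode(charset)
    except UnicodeEncodeError:
        # The charset has no code point for the euro sign.
        return b""

if __name__ == "__main__":
    for name in ("cp1252", "iso8859-15", "iso8859-1"):
        print(name, encode_euro(name))
```

Sending 0x80 under an ISO-8859-1 label would mean bypassing the strict
codec on purpose, which is exactly the "works even though 'wrong'"
trade-off.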

> Have others faced this? (Perhaps you just avoid the problems with the
> numeric or named character reference)

That pushes the problem off onto the client machine, at least. Using
a named reference causes clients that don't handle the euro symbol
to see the literal "&euro;", which may break table lineups but is intelligible.
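
One way the character-reference fallback could be wired into a
Unicode-to-charset conversion step is sketched below in Python. This is
an assumption-laden illustration, not anything proposed in the thread:
the handler name "euro_entity" and the function to_web_bytes are
hypothetical, and it only makes sense when the output is HTML.

```python
import codecs

def _euro_entity(err):
    """Error handler: emit character references for unencodable text."""
    if not isinstance(err, UnicodeEncodeError):
        raise err
    bad = err.object[err.start:err.end]
    # Named reference for the euro, numeric references for anything else.
    repl = "".join(
        "&euro;" if ch == "\u20ac" else f"&#{ord(ch)};" for ch in bad
    )
    return repl, err.end

# "euro_entity" is a made-up handler name registered for this sketch.
codecs.register_error("euro_entity", _euro_entity)

def to_web_bytes(text: str, charset: str) -> bytes:
    """Encode text for HTML output, falling back to references."""
    return text.encode(charset, errors="euro_entity")
```

With this, a charset that has the euro sign still gets its native byte,
and one that lacks it gets the reference, so the decision about glyphs
moves to the client as described above.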


-- 
John Cowan

Schlingt dreifach einen Kreis vom dies!
Schliesst euer Aug vor heiliger Schau,
Denn er genoss vom Honig-Tau,
Und trank die Milch vom Paradies.
        -- Coleridge (tr. Politzer)