On 1997-07-18 John Cowan <cowan@ccil.org> wrote:
> When preparing a comparison table of the various 8859-x parts,
> I noticed an odd property of partial consistency between 8859-1-2-3-4.
> Any character encoded by more than one coded character set is
> always encoded at the same codepoint.  (8859-9 does not have
> this property; its Turkish letters don't agree with the 8859-3
> encoding.)
>
> [examples]
>
> What I'm wondering is whether this property was carefully designed
> into 8859-1-2-3-4 when they were specified, or whether it is more or
> less an accident of copying.
> 
> Does anyone know?
I don't know for certain, but I don't think it's accidental. Consider these
correspondences between ASCII and 8859-1, where each 8859-1 code is the ASCII
code plus 128:
ASCII  Char    8859-1      Name
 32             160        Non-breaking space
 33    !        161        Inverted exclamation
 35    #        163        Pound sterling
 36    $        164        General currency sign
 45    -        173        Soft hyphen
 48    0        176        Degree sign
 50    2        178        Superscript two
 51    3        179        Superscript three
 63    ?        191        Inverted question mark
(from http://www.pemberley.com/janeinfo/latin1.html)
(My Telnet program does not cut-and-paste 8859-1 properly, so I've deleted
the right-hand column.)
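
The pairing in the table can be reproduced mechanically. What follows is only
a rough sketch in Python: the helper name latin1_partner is made up for
illustration, and the character names it prints are the Unicode names, which
are worded a little differently from the ones in the table.

```python
import unicodedata

# The ASCII characters from the left-hand side of the table above.
ASCII_CHARS = " !#$-023?"


def latin1_partner(ch):
    """Return the 8859-1 character 128 positions above an ASCII character.

    Hypothetical helper, for illustration only.
    """
    code = ord(ch) + 128
    # Decode the single byte as ISO 8859-1 (Python calls the codec "latin-1").
    return bytes([code]).decode("latin-1")


for ch in ASCII_CHARS:
    partner = latin1_partner(ch)
    print(ord(ch), repr(ch), ord(partner), unicodedata.name(partner))
```

Each output line gives the ASCII code, the ASCII character, the 8859-1 code and
the name of the 8859-1 character, so the deleted right-hand column can be
recovered by name without pasting any 8-bit characters.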
Notice, by the way, that the number sign and the pound sterling sign share the
lower 7 bits (35 and 163 differ only in the top bit), so any 7-bit channel that
strips the eighth bit turns one into the other. This is probably why I often
see amounts of money written as #19.95 here in the UK (electronically, that
is). I think this is a more likely explanation than the fact that both are
called "pound", although that probably also has some effect.
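
To make the effect concrete, here is a small sketch in Python of what I assume
is happening: some 7-bit hop along the way clears the top bit of every byte.
The function name strip_high_bit is invented for the example, and real mail
gateways may well mangle things differently.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear bit 7 of every byte, as an assumed 7-bit-only channel would."""
    return bytes(b & 0x7F for b in data)


# "\xa3" is the pound sterling sign, code 163 in 8859-1.
price = "\xa319.95".encode("latin-1")

# 163 & 0x7F == 35, which is the ASCII number sign.
print(strip_high_bit(price).decode("ascii"))  # prints #19.95
```

The digits and the full stop are already below 128, so only the currency sign
changes.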
<sigh>
-- Daniel B
 
e'ocai ko sarji la lojban