Re: Windows and Mac character encoding questions

From: Ernest Cline (ernestcline@mindspring.com)
Date: Tue Mar 30 2004 - 01:26:56 EST


    > [Original Message]
    > From: John Cowan <cowan@ccil.org>
    >
    > Mark Davis scripsit:
    >
    > > Some more details. Usually, by 'extension' one means a superset of
    > > the mappings. windows-1252 is formally disjoint from iso-8859-1 --
    > > not a superset -- since it has mappings for 0x80..0x9F which are
    > > different from iso-8859-1's mappings for the same bytes.
    >
    > I don't have access to ISO 8859-1 itself, but ECMA-94 (1986), which is
    > supposed to be equivalent, doesn't actually define anything for
    > 0x80..0x9F.
    > So I think the term "superset" is in fact justified.

    ECMA-94 says nothing about the C1 control set; it specifies only the
    G0 and G1 graphics sets. ECMA-43 (ISO 4873), however, does. The octets
    08/14 and 08/15, if present, are only allowed to be used for the SS2
    and SS3 control functions according to ECMA-43. If ISO 8859 says
    anything about the control sets, I think it is safe to say that at the very
    least it references ISO 4873. In that case, the windows-1252 use of
    0x8E as LATIN CAPITAL LETTER Z WITH CARON would violate
    that standard. Also, RFC 1345 indicates that the standard C0 and C1
    control sets of ISO 6429 (ECMA-48) are used with ISO 8859-1, but I
    can't be certain whether that is just the usual assumption or is
    explicitly given in ISO 8859.
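
To make the 0x8E conflict concrete, here is a minimal Python sketch that decodes each byte in 0x80..0x9F both ways. It assumes Python's built-in "latin-1" codec follows the common convention of mapping those bytes to the C1 controls U+0080..U+009F (which ISO 8859-1 itself may or may not mandate, as discussed above), and that Python's "cp1252" codec treats its unassigned bytes as errors. The function name compare_c1_range is only illustrative.

```python
import unicodedata

def compare_c1_range():
    """Return (byte, latin-1 code point, cp1252 code point or None) for 0x80..0x9F."""
    rows = []
    for b in range(0x80, 0xA0):
        raw = bytes([b])
        latin1 = ord(raw.decode("latin-1"))
        try:
            cp1252 = ord(raw.decode("cp1252"))
        except UnicodeDecodeError:
            cp1252 = None  # byte has no assignment in Python's cp1252 table
        rows.append((b, latin1, cp1252))
    return rows

rows = compare_c1_range()
for b, latin1, cp1252 in rows:
    if b in (0x8E, 0x8F):  # the SS2 and SS3 positions, 08/14 and 08/15
        name = unicodedata.name(chr(cp1252)) if cp1252 is not None else "(unassigned)"
        print(f"0x{b:02X}: latin-1 -> U+{latin1:04X}, cp1252 -> {name}")
```

Under those assumptions, the two codecs disagree on every assigned byte in this range, so neither mapping table contains the other.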

    In any case, windows-1252 is not ISO-2022 (ECMA-35) friendly, and
    given the existence of LATIN CAPITAL LETTER Z WITH CARON
    as 0x8E, it certainly does not fit nicely into the ECMA/ISO family
    of interrelated character set standards, even if one overlooks the
    fact that it uses graphics characters in the C1 control set.


