Undefined code positions in 8-bit character sets

From: Andreas Prilop (prilop2008@trashmail.net)
Date: Mon May 05 2008 - 10:30:37 CDT

    I refer to
     http://www.unicode.org/Public/MAPPINGS/ISO8859/8859-1.TXT
     http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT

    In ISO-8859-1,   code position 0x90 is mapped to U+0090.
    In Windows-1252, code position 0x90 is listed as "undefined".

    Why are they treated differently?
    International Standard ISO/IEC 8859-1 does *not* define
    code position 0x90, so it could equally be listed as "undefined".

    Or, for purely practical reasons, 0x90 in Windows-1252 might
    also be mapped to U+0090.
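
    To illustrate the difference, here is a minimal Python sketch. It
    assumes that Python's bundled "iso-8859-1" and "cp1252" codecs follow
    the two mapping tables cited above for byte 0x90; other libraries may
    behave differently. The helper decode_cp1252_with_c1_fallback is a
    hypothetical name, shown only to make the "purely practical" mapping
    concrete. It is not part of any standard or library.

```python
# Sketch only: shows how one decoder (Python's bundled codecs) treats 0x90.

def decode_cp1252_with_c1_fallback(data: bytes) -> str:
    """Hypothetical helper: decode Windows-1252, mapping any byte the
    codec leaves undefined to the same-numbered code point
    (for example 0x90 -> U+0090)."""
    chars = []
    for byte in data:
        try:
            chars.append(bytes([byte]).decode("cp1252"))
        except UnicodeDecodeError:
            # Undefined in the codec's table: fall back to U+00XX.
            chars.append(chr(byte))
    return "".join(chars)


# ISO-8859-1: 0x90 decodes to U+0090.
latin1_result = b"\x90".decode("iso-8859-1")

# Windows-1252: a strict decode of 0x90 raises an error.
try:
    b"\x90".decode("cp1252")
    cp1252_strict_fails = False
except UnicodeDecodeError:
    cp1252_strict_fails = True

# With the fallback, defined bytes (0x80) decode normally and
# undefined ones (0x90) pass through as C1 control code points.
fallback_result = decode_cp1252_with_c1_fallback(b"\x80\x90")
```

    A strict decoder that rejects 0x90 in Windows-1252 but accepts it in
    ISO-8859-1 is one way the two tables can lead to different results
    for the same input.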

    This inconsistent treatment of undefined code positions can
    occasionally cause trouble. Please see the thread
    "Fallback to UTF-8" in
     http://lists.w3.org/Archives/Public/www-validator/2008Apr/
     http://lists.w3.org/Archives/Public/www-validator/2008May/


