Re: Undefined code positions in 8-bit character sets

From: Doug Ewell (
Date: Mon May 05 2008 - 20:06:56 CDT

    Kenneth Whistler <kenw at sybase dot com> wrote:

    >> On the other hand, Windows-1252 might be extended again and assign a
    >> meaning to 0x90, so it is probably better not to map any Unicode
    >> codepoint to that value.
    > I disagree. If you do not map U+0090 to 0x90 for Windows-1252, all you
    > are doing is ensuring an interoperability bug both with Windows and
    > with other commercial applications doing conversions.

    If you are working in either ISO 8859-1 or Windows-1252, and encounter
    the byte 0x90, you've got problems already. You might do well to ask
    yourself whether your text is even in one of those encodings, or whether
    it is mislabeled or a bad assumption was made.
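
A rough Python sketch of that sanity check, for illustration only. It assumes the five byte values commonly listed as unassigned in Windows-1252 (0x81, 0x8D, 0x8F, 0x90, 0x9D); the names UNASSIGNED_CP1252 and suspicious_offsets are hypothetical, not taken from any library. It also shows the disagreement discussed above in miniature: Python's "cp1252" codec rejects 0x90 outright, while its "latin-1" codec passes it through as U+0090.

```python
# Byte values assumed to be unassigned in Windows-1252 (illustrative list).
UNASSIGNED_CP1252 = frozenset({0x81, 0x8D, 0x8F, 0x90, 0x9D})


def suspicious_offsets(data):
    """Return offsets of bytes that hint the text may not be Windows-1252."""
    return [i for i, b in enumerate(data) if b in UNASSIGNED_CP1252]


sample = b"caf\xe9 \x90"

# A non-empty result suggests the label or the assumption deserves a second look.
print(suspicious_offsets(sample))

# Python's strict cp1252 codec refuses the unassigned byte...
try:
    sample.decode("cp1252")
except UnicodeDecodeError as err:
    print("cp1252 rejected the input:", err.reason)

# ...while latin-1 maps every byte, so 0x90 becomes the C1 control U+0090.
print(ascii(sample.decode("latin-1")))
```

Whether a converter should reject such a byte or pass it through as a C1 control is exactly the point under debate; the check above only flags input worth questioning before either policy is applied.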

    Doug Ewell  *  Arvada, Colorado, USA  *  RFC 4645  *  UTN #14
