RE: data for cp1252

From: Shawn Steele <Shawn.Steele_at_microsoft.com>
Date: Sat, 8 Dec 2012 01:01:22 +0000

> In contrast, bringing the cp1252 definition into line with real implementations and recommending UTF-8 for new developments are not mutually exclusive.

Exactly?

If you already have existing data in 1252 or a variation (and can't tell them apart), then nothing's gained by making NEW requirements for 1252 which the old data won't conform to. Changing standards or behavior will only break things that already work.

If you're creating new data, it should be using UTF-8 to avoid these kinds of ambiguity.

-Shawn
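
To illustrate the ambiguity mentioned above, here is a minimal Python sketch (the sample bytes and variable names are made up for illustration, not taken from the thread). The same bytes read differently depending on which single-byte code page the reader guesses, whereas text written as UTF-8 round-trips without any guess.

```python
# Hypothetical sample: the same bytes have two plausible single-byte readings.
raw = b"\x93quoted\x94"

as_cp1252 = raw.decode("cp1252")    # curly quotes U+201C / U+201D
as_latin1 = raw.decode("latin-1")   # C1 control characters U+0093 / U+0094

# Text written as UTF-8 round-trips without guessing a code page.
text = "\u201cquoted\u201d"
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

print(repr(as_cp1252), repr(as_latin1), round_trip == text)
```

Nothing in the bytes themselves says which reading was intended, which is why new data is better off in UTF-8.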

On Fri, Dec 7, 2012 at 4:41 PM, Shawn Steele <Shawn.Steele_at_microsoft.com> wrote:
It's a variation. The undefined codepoints in 1252 probably shouldn't be used, and I can't imagine that adding a code page helps anything, nor that changing an existing behavior helps anything. People really should be using UTF-8.

-Shawn

From: Buck Golemon [mailto:buck_at_yelp.com]
Sent: Friday, December 7, 2012 4:34 PM
To: Shawn Steele
Cc: unicode

Subject: Re: data for cp1252

I've been told that bestfit1252 wasn't meant to redefine the cp1252 mapping, although its first line declares "CODEPAGE 1252".

Is it a separate encoding or not?

If so, I'll submit a new "bestfit1252" to the python stdlib.
If not, I believe the cp1252 mapping needs to be brought into line.
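
For context, a rough sketch of the difference in question. Python's stdlib `cp1252` codec treats five byte values as undefined and refuses to decode them, while (as far as I understand the bestfit1252 table) those bytes map to the C1 control characters with the same numeric value. The names `UNDEFINED`, `strict_decode_byte` and `bestfit_style_decode` are hypothetical helpers for illustration, not an existing or proposed stdlib API.

```python
# Byte values that Python's cp1252 codec leaves undefined.
UNDEFINED = (0x81, 0x8D, 0x8F, 0x90, 0x9D)


def strict_decode_byte(byte_value):
    """Decode one byte with the stdlib cp1252 codec; None if undefined."""
    try:
        return bytes([byte_value]).decode("cp1252")
    except UnicodeDecodeError:
        return None


def bestfit_style_decode(data):
    """Assumed bestfit-like behaviour: undefined bytes become the C1
    control character with the same value; all others use cp1252."""
    return "".join(
        chr(b) if b in UNDEFINED else bytes([b]).decode("cp1252")
        for b in data
    )


for b in UNDEFINED:
    print(hex(b), strict_decode_byte(b), repr(bestfit_style_decode(bytes([b]))))
```

Whether that second behaviour belongs under the name "cp1252" or under a separate codec name is exactly the question being asked here.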

On Fri, Dec 7, 2012 at 4:27 PM, Shawn Steele <Shawn.Steele_at_microsoft.com> wrote:
:)
Received on Fri Dec 07 2012 - 19:13:08 CST
