Re: Additions to code page 1252

From: Lee Fryer-Davis (lfryerda@teklogix.com)
Date: Thu Jul 10 1997 - 14:52:08 EDT


At 10:23 am 10/07/1997 -0700, Unicode Discussion wrote:
>>
>> We plan on adding the following 2 mappings to the Windows code page 1252
>> definition so that this encoding will be a superset of the new ISO 8859
>> encoding.
>>
>> 0x8e = U+017d Latin Capital Letter Z With Caron
>> 0x9e = U+017e Latin Small Letter Z With Caron
>>
>> Thanks, Lori Brownell (LoriBr@Microsoft.com)
>>
>
>First of all, thanks for posting this information to the Unicode
>list. It is helpful to everyone to get advance notice of additions
>like this. I presume this is in addition to the addition of 0x80 =
>U+20AC EURO SIGN.
>
>And I don't want to seem like wanting to shoot the messenger, but
>Lee Fryer-Davis and Tony Harminc have raised good points about this.
>

Just a quick comment: I am not trying to shoot the messenger here, since I do appreciate the advance notice Lori is giving everyone. My only concern is whether this has happened at other times and, if so, where I can find the information. Certainly this has happened at other companies (DEC made quite different changes to some code pages across the VT200, VT300 and VT500 series of terminals without changing the documentation or providing versioning; it took me several days to find out what was going on), so I am not Microsoft bashing, just trying to clarify the situation and make others aware of it as well.

>Microsoft's approach to the Windows code pages seems to be that
>it is o.k. to grow them by gradual accretion of new characters
>filling in the gaps, without changing the code page identity or
>providing any visible (or even documented) versioning.

The reference I made in my earlier e-mail was to an actual shift of codepoints within the code page, not just the filling of holes in the page, which is what this latest change does. I was hoping that this has not happened in any other pages; if it has, the documentation I have may be incomplete, which eventually causes problems and complaints at customer sites due to inconsistencies (a thing I try to avoid :-).
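
To make the distinction between "filling holes" and "shifting codepoints" concrete, here is a minimal Python sketch that compares two versions of a code page table. The function name diff_code_page and the three small tables are my own illustrative assumptions: the "old" and "new" excerpts are only a few entries, not complete or authoritative tables, and the "shifted" table is entirely hypothetical.

```python
def diff_code_page(old, new):
    """Compare two byte -> Unicode code point tables (hypothetical helper).

    Returns three dicts:
      added    - bytes that were holes in `old` and are defined in `new`
      remapped - bytes defined in both, but with different code points
      removed  - bytes defined in `old` that are holes in `new`
    """
    added = {b: new[b] for b in new if b not in old}
    remapped = {b: (old[b], new[b])
                for b in old if b in new and old[b] != new[b]}
    removed = {b: old[b] for b in old if b not in new}
    return added, remapped, removed


# Illustrative excerpt only: two entries of a 1252-style table.
old_excerpt = {0x8A: 0x0160, 0x9A: 0x0161}

# The same excerpt after the additions discussed in this thread.
new_excerpt = dict(old_excerpt)
new_excerpt.update({0x80: 0x20AC, 0x8E: 0x017D, 0x9E: 0x017E})

# Entirely hypothetical table in which an existing byte changes meaning.
shifted_excerpt = {0x8A: 0x0161, 0x9A: 0x0161}

added, remapped, removed = diff_code_page(old_excerpt, new_excerpt)
shift_added, shift_remapped, shift_removed = diff_code_page(
    old_excerpt, shifted_excerpt)
```

With these tables, the first comparison reports only additions (old data still decodes the same way), while the second reports a remapped byte, which is the case that silently changes the meaning of existing data.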

>While for
>many uses this is unobjectionable and provides a means of satisfying
>customer and/or vendor needs in an incremental way, for other
>purposes it causes major trouble.
>
>In particular, there has been a lot of discussion on this list just
>recently regarding the identification of CP1252 as a "charset" on
>the Internet. The way the MIME charset is defined, a change in
>repertoire for a character set implies a change in "charset", since
>it changes the way an octet stream is mapped into characters. The
>current situation on the Internet is that there are 2 CP1252's,
>and with these additions, now 3 CP1252's, all of which are chaotically
>and indeterminately related to ISO-8859-1, assumed as default by
>many browsers on many platforms. The net result is interoperability
>problems for the 0x80..0x9F characters in CP1252, and many visible
>"bugs" in documents containing Windows 1252 characters.
>
>Given that Microsoft has a major impact as a "producer" of
>code pages, and is also a major player in web browsers and web
>authoring tools, is there any possibility that Microsoft could take
>some ownership of this character set identity and versioning problem
>and get involved with the IETF folks wrestling with the IANA
>"charset" registry that is referenced by Internet standards?
>A little proactivity in the area might save a lot of Microsoft
>bashing in the Internet arena.
>
>--Ken Whistler
>
>
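
As a rough illustration of the 0x80..0x9F interoperability problem Ken describes, the following Python sketch decodes the same three octets under two charset labels. It assumes Python's built-in "cp1252" codec, which includes the Euro sign and the two Z-with-caron additions discussed in this thread, and its "latin-1" codec for ISO-8859-1; the helper name code_points is my own.

```python
def code_points(text):
    """Format each character of `text` as a U+XXXX string (hypothetical helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


# The three octets under discussion: 0x80, 0x8E and 0x9E.
octets = bytes([0x80, 0x8E, 0x9E])

# Same octet stream, two different charset labels.
as_cp1252 = code_points(octets.decode("cp1252"))
as_latin1 = code_points(octets.decode("latin-1"))

print("cp1252 :", as_cp1252)
print("latin-1:", as_latin1)
```

Under the code page the octets come out as printable characters; under ISO-8859-1 they come out as C1 control codes, which is why a receiver that assumes the wrong label shows visible "bugs".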

Lee Fryer-Davis
Teklogix, Inc.
lfryerda@teklogix.com


