Re: _Unicode_code_page_and_?.net

From: Asmus Freytag
Date: Tue, 30 Jul 2013 13:05:27 -0700

On 7/30/2013 12:26 PM, Doug Ewell wrote:
> Buck Golemon <buck at yelp dot com> replied to Richard Wordingham
> <richard dot wordingham at ntlworld dot com>:
>>>> There are no Unicode code pages.
>>> Just to be pedantic, there are several on Windows. They encode the
>>> coding form (Unicode codes being best thought of as an assignment of
>>> natural numbers to characters, with certain approved ways of storing
>>> those numbers), e.g. Code pages 1200 (little-endian UTF-16), 1201
>>> (big-endian UTF-16), 12000 (little-endian UTF-32), 12001 (big-endian
>>> UTF-32), 65000 (UTF-7) and 65001 (UTF-8).
>> I shudder to imagine the circumstances that forced you to learn this
>> information.
> Most Windows .NET developers who are concerned about proper character
> handling would know this information existed, though they might not have
> the numbers memorized.
> Jukka was right, though: Unicode itself does not have code pages.
> Rather, at least one vendor has defined some of the Unicode encoding
> schemes as if they were code pages. A code page is not, in general, the
> same as an encoding scheme.
What is, then, the proper definition of a "code page"?

When Unicode was first introduced, it was seen as the one thing that
wasn't a "code page", because of the way the Win32 API paired the
traditional code pages with Unicode (giving rise to the "A" and "W"
versions of all the APIs).

Later, it was realized that in order to specify what encoding data were
in or, for example, to specify a conversion from UTF-7 or UTF-8 to
UTF-16 (the native encoding scheme), one needed some suitable ID number
to identify the mapping. Extending the code page ID was the most
natural way to do that because, on several platforms, the use of a
numerical ID from the IBM code page registry was established practice.
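
To make the idea concrete, here is a minimal sketch in Python of what
"a numeric ID that identifies the mapping" amounts to. The identifiers
are the ones Richard listed above; the table name, the function name and
the choice of Python codec names are my own assumptions for
illustration, not anything defined by Windows or by Unicode.

```python
# Hypothetical lookup table: Windows code page identifier -> Python codec name.
# The identifiers are those quoted earlier in this thread; the codec names on
# the right are Python's own labels, chosen here as an assumption.
UNICODE_CODE_PAGES = {
    1200: "utf-16-le",   # little-endian UTF-16
    1201: "utf-16-be",   # big-endian UTF-16
    12000: "utf-32-le",  # little-endian UTF-32
    12001: "utf-32-be",  # big-endian UTF-32
    65000: "utf-7",
    65001: "utf-8",
}


def convert(data: bytes, source_id: int, target_id: int = 1200) -> bytes:
    """Re-encode data from one identified encoding to another.

    Both encodings are selected purely by numeric ID; the default target
    is 1200 (little-endian UTF-16).
    """
    text = data.decode(UNICODE_CODE_PAGES[source_id])
    return text.encode(UNICODE_CODE_PAGES[target_id])


# UTF-8 input converted to little-endian UTF-16, identified only by number.
utf16_bytes = convert("Zürich".encode("utf-8"), 65001)
```

The point of the sketch is only that the caller never names an encoding
scheme; a single integer selects the mapping, which is the role the code
page ID was extended to play.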


> --
> Doug Ewell | Thornton, CO, USA
> @DougEwell
Received on Tue Jul 30 2013 - 15:07:59 CDT
