From: Benjamin Peterson (ben@jbrowse.com)
Date: Mon Apr 05 2004 - 03:44:07 EDT
Versions up until Windows 2000 used UCS-2 internally. Windows 2000 and XP
use UTF-16, although applications tend to have differing levels of
awareness of surrogates.
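The practical difference shows up with characters outside the Basic
Multilingual Plane. Under UCS-2 these simply cannot be represented; under
UTF-16 they become a surrogate pair, i.e. two 16-bit code units. A quick
illustration in Python (just a sketch of the encoding, not of any Windows
API):

```python
# U+1D11E (MUSICAL SYMBOL G CLEF) lies above U+FFFF, so UTF-16
# must encode it as a surrogate pair of two 16-bit code units.
ch = "\U0001D11E"
data = ch.encode("utf-16-be")          # big-endian, no BOM
units = [data[i:i + 2].hex() for i in range(0, len(data), 2)]
print(units)                           # ['d834', 'dd1e'] -- high and low surrogate
```

An application that is "UCS-2 aware" but not "UTF-16 aware" will see those
two code units as two separate (and individually meaningless) characters.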
Regardless of whether UCS-2 or UTF-16 is used, Microsoft documentation
always refers to any Unicode encoding as 'Unicode'. I attribute this to
the same magical field that makes otherwise sensible people say things
like 'a Unicode character is a 16-bit number' or 'UTF-8 is an efficient
way to store text'.
Benjamin
On Mon, 5 Apr 2004 10:06:07 +0530, "Mahesh T. Pai" <paivakil@vsnl.net>
said:
> Dan Smith said on Fri, Apr 02, 2004 at 03:04:22PM -0500,:
>
> > 1) The documentation we've found for Unicode support in Windows seems vague on
> > how Unicode is implemented. A good deal of it seems to imply that a character
> > is always represented by exactly two bytes, no more, no less, under all
> > conditions. And the specific term UTF-16 doesn't seem to be employed. Precisely
> > what Unicode encoding is employed by Windows (specifically Win2K, Windows
> > Server 2003, and WinXP)? Is it, in fact, UTF-16, including the use of surrogate
> > pairs? Or is it something older, or a subset, or some Microsoft variation,
> > restricted to 65,536 characters?
>
> Shouldn't this be asked on one of Microsoft's forums?
>
> I am interested in the answers, though.
>
>
> --
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
>
> Mahesh T. Pai, LL.M.,
> 'NANDINI', S. R. M. Road,
> Ernakulam, Cochin-682018,
> Kerala, India.
>
> http://paivakil.port5.com
>
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
>
-- Benjamin Peterson bjsp123@imap.cc
This archive was generated by hypermail 2.1.5 : Mon Apr 05 2004 - 04:35:02 EDT