Re: Wide Characters in Windows and UTF16

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Thu Aug 12 2004 - 11:19:12 CDT

Next message: John Cowan: "Re: Combining across markup?"

Previous message: António Martins-Tuválkin: "Re: Combining across markup?"
In reply to: Rick Cameron: "RE: Wide Characters in Windows and UTF16"
Next in thread: Rick Cameron: "RE: Wide Characters in Windows and UTF16"

Rick Cameron wrote:
> Microsoft Windows uses little-endian byte order on all platforms. Thus, on
> Windows UTF-16 code units are stored in little-endian byte order in memory.
>
> I believe that some linux systems are big-endian and some little-endian. I
> think linux follows the standard byte order of the CPU. Presumably UTF-16
> would be big-endian or little-endian accordingly.

This is somewhat misleading. For internal processing, where we are talking about the UTF-16 encoding
form (quite different from the external encoding _scheme_ of the same name), we don't have strings
of bytes but strings of 16-bit units (WCHAR in Windows). Program code operating on such strings
could not care less what endianness the CPU uses. Endianness is only an issue when the text gets
byte-serialized, as is done for the external encoding schemes (and usually by a conversion service).
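
To illustrate the distinction, here is a minimal sketch in Python (not from the original discussion). It models a UTF-16 string as a list of 16-bit integers; the helper names count_code_points and serialize are hypothetical, chosen only for this example. The counting logic looks at code unit values alone, and byte order shows up only in the serialization step.

```python
def count_code_points(units):
    """Count code points in a sequence of UTF-16 code units (ints 0..0xFFFF)."""
    count = 0
    i = 0
    while i < len(units):
        u = units[i]
        # A lead surrogate followed by a trail surrogate forms one code point.
        is_pair = (
            0xD800 <= u <= 0xDBFF
            and i + 1 < len(units)
            and 0xDC00 <= units[i + 1] <= 0xDFFF
        )
        i += 2 if is_pair else 1
        count += 1
    return count


def serialize(units, byteorder):
    """Byte-serialize code units; "big" gives UTF-16BE, "little" gives UTF-16LE."""
    return b"".join(u.to_bytes(2, byteorder) for u in units)


# "A" followed by U+1D11E (MUSICAL SYMBOL G CLEF) as UTF-16 code units.
units = [0x0041, 0xD834, 0xDD1E]

n = count_code_points(units)     # processing: no byte order involved
be = serialize(units, "big")     # encoding scheme UTF-16BE
le = serialize(units, "little")  # encoding scheme UTF-16LE
```

The same list of code units yields two different byte sequences, and both decode back to the same text. Nothing in count_code_points would change on a CPU with the opposite byte order.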

markus

This archive was generated by hypermail 2.1.5 : Thu Aug 12 2004 - 11:26:17 CDT