From: Markus Scherer (firstname.lastname@example.org)
Date: Thu Jan 29 2004 - 11:50:46 EST
Hi Deepak,
I recommend keeping this thread on the unicode list for a better chance of getting the
As I said in my earlier email, I would try the Windows command line window (DOS prompt window) and
set it to Unicode mode via "chcp 10000".
I just tried this on Windows 2000, and pasting Unicode characters (that are not in the OEM codepage)
from the character map does not work. It appears to perform a conversion from Unicode to the OEM
codepage (and then back out).
My other machine has Windows XP. There, the same experiment works - I can paste non-Latin-1 accented
Latin characters, Greek, the Euro symbol, etc.
I have not tried this on either machine with a non-English keyboard or IME.
I do not have other shells available on my Windows machines.
Microsoft people (and users) on the list should be able to give more tips.
Deepak Chand Rathore wrote:
> hi markus,
> do u know any shell through which we can enter 16 bit file names in windows
> as in Windows 2000, both FAT and NTFS use the Unicode character set for
> their names, but i am able to enter
> 16 bit characters only through GUI.
> does such shell exist or not ?
> Thanks for ur ideas.
> -----Original Message-----
> From: Markus Scherer [mailto:email@example.com]
> Sent: Thursday, 22 January 2004 22:41
> To: unicode
> Subject: Re: problem - non-ASCII characters on Windows command line
> Your code looks like a Windows program.
> I recommend using the WCHAR* version of main() itself - wmain() or _wmain()
> or similar. It's been a while since I did this... see MSDN for details.
> In other words, don't just use a char* version of main() and then try to
> convert to Unicode, but use the Unicode version of main() directly. You will
> then get WCHAR *argv right away.
> Also, try not to output to another non-Unicode codepage. In your case, you
> get input in the system "ANSI" codepage (which is the Windows non-Unicode
> codepage for legacy applications), and since you output to the console, your
> output is converted to the "OEM" codepage.
> At a minimum, try setting your console to Unicode (UTF-16LE) via "chcp
> 10000". Alternatively, try setting it to your "ANSI" codepage via "chcp 1252"
> or whatever is
> It would be better if you did not have to convert out to a non-Unicode
> codepage at all. For example, if the output is consumed by Notepad or another
> application (via a pipe or output redirect etc.), you could just output in
> UTF-8 (codepage 65001 on Windows, I believe) or UTF-16LE (byte-serialize your
> WCHAR*). I recommend prepending U+FEFF to your output stream because many
> Windows applications recognize it as the Unicode signature.
> Best regards,
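
The advice in the quoted message about byte-serializing UTF-16 and prepending U+FEFF can be illustrated with a minimal sketch in portable C. This is not code from the thread: the function names utf16le_with_bom() and utf8_encode() are hypothetical, uint16_t stands in for the 16-bit Windows WCHAR, and the UTF-8 helper does not reject surrogate code points or values above U+10FFFF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write n UTF-16 code units as UTF-16LE bytes,
 * preceded by the U+FEFF signature (bytes FF FE in little-endian order).
 * The caller must provide room for 2 * (n + 1) bytes.
 * Returns the number of bytes written. */
size_t utf16le_with_bom(const uint16_t *units, size_t n, unsigned char *out)
{
    size_t k = 0;
    assert(out != NULL);
    out[k++] = 0xFF;                                /* U+FEFF, low byte first */
    out[k++] = 0xFE;
    for (size_t i = 0; i < n; i++) {
        out[k++] = (unsigned char)(units[i] & 0xFF); /* low byte */
        out[k++] = (unsigned char)(units[i] >> 8);   /* high byte */
    }
    return k;
}

/* Hypothetical helper: encode one code point as UTF-8 (1 to 4 bytes).
 * Assumes cp is a valid Unicode scalar value; no validation is done.
 * Returns the number of bytes written to out. */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}
```

With these helpers, the Euro sign U+20AC followed by "A" serializes to the UTF-16LE bytes FF FE AC 20 41 00, and U+FEFF itself encodes to the UTF-8 signature EF BB BF. Writing the bytes explicitly keeps the output independent of the console codepage, which is the point of the suggestion above.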