UTF8 file transfer and interoperability problem

From: Tay, William (William.Tay@usa.xerox.com)
Date: Fri Jun 07 2002 - 10:31:52 EDT


I'd like to know how file transfer works for filenames encoded in UTF-8
when using the FTP, NetWare, and SMB protocols. As far as I know, Win NT/2000
encodes filenames in UTF-16 LE, right? So what happens when Windows receives
UTF-8 filenames via file transfer from a Linux/Unix machine?

1. Using FTP, when I transfer déçu (UTF-8 bytes: 64 C3 A9 C3 A7 75) and kスk
(UTF-8 bytes: 6B E3 82 B9 6B) from a Linux to a Win NT machine, I see déçu
and kスk respectively in Explorer. It seems that Windows is interpreting
each byte as a character (using CP1252 on an English machine?). The question
is: do any of the above protocols carry character-encoding information, so
that Windows can implicitly convert the UTF-8 bytes to UTF-16? If not, what is
the recommended way to achieve such interoperability?
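
The CP1252 guess in point 1 can be checked directly against the quoted bytes. The following is only a minimal sketch in Python; the helper name describe is made up for illustration, and it says nothing about what FTP or Explorer actually do internally. It decodes the same bytes once as UTF-8 and once as CP1252, and shows the UTF-16 LE bytes that a correct conversion would produce.

```python
# Raw filename bytes exactly as quoted in point 1.
NAMES = [
    bytes.fromhex("64 C3 A9 C3 A7 75"),
    bytes.fromhex("6B E3 82 B9 6B"),
]


def describe(raw: bytes) -> dict:
    """Hypothetical helper: show one byte string under several interpretations."""
    text = raw.decode("utf-8")  # the interpretation the sender intended
    return {
        "utf8": text,
        # What appears if every byte is treated as one CP1252 character.
        "cp1252": raw.decode("cp1252"),
        # The bytes a correct UTF-8 to UTF-16 LE conversion would yield.
        "utf16le": text.encode("utf-16-le").hex(" "),
    }


if __name__ == "__main__":
    for raw in NAMES:
        print(describe(raw))
```

If the "cp1252" values match what Explorer shows, the bytes arrived intact and only their interpretation on the Windows side differs.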

2. I mounted a Linux directory that contains the two files above on Win NT
and see that they are displayed as d├¬├ºu and kπé¦k in Explorer. Is there
any explanation for why they appear differently from the names above?
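
One possible explanation for point 2, offered only as a guess, is that the mounted share maps the bytes through the DOS/OEM code page (CP437 on a US English system) instead of CP1252. The sketch below, with the made-up helper name as_oem, decodes the same bytes as CP437. The result is close to the names quoted above but not identical in every character, so some further lossy conversion may be involved.

```python
# Raw filename bytes exactly as quoted in point 1.
NAMES = [
    bytes.fromhex("64 C3 A9 C3 A7 75"),
    bytes.fromhex("6B E3 82 B9 6B"),
]


def as_oem(raw: bytes, codepage: str = "cp437") -> str:
    """Hypothetical helper: treat each byte as one OEM code page character."""
    return raw.decode(codepage)


if __name__ == "__main__":
    for raw in NAMES:
        # Compare the printed strings with what Explorer shows on the mount.
        print(raw.hex(" "), "->", as_oem(raw))
```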


