RE: Filenames with non-Ascii characters

From: Jungshik Shin (
Date: Tue Feb 24 2004 - 03:13:27 EST

    On Mon, 23 Feb 2004, Dipti Srivastava wrote:

    > Dipti Srivastava asked:
    > > If I set my LC_TYPE to en_US.UTF8 do I need to convert the non-Ascii
    > > characters like
    > > '\' in the filename for functions like open, etc.

     It seems like your original question led Ken to believe you're working
    on POSIX (although Win 2k/XP has a POSIX subsystem, one usually doesn't work
    at that 'level' on Windows).

    > What if the filename contains Japanese characters, e.g. the Japanese
    > file separator.

     Well, here's the fun part. The standard Windows TrueType fonts for
    Japanese and Korean carry a 'YEN' sign and a 'WON' sign glyph,
    respectively, at U+005C in the TrueType Unicode cmap (PID=3, EID=1)
    instead of 'reverse solidus', which apparently got you confused. The
    character is still U+005C; only the glyph drawn for it differs. I wrote
    on this list a couple of years ago about why it's a very bad practice
    and how it can be dealt with. I also wrote to MS about the issue.
    Unfortunately, they put up an MSDN KB article defending their practice
    instead of doing what I 'believe' is the right thing.
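
    To make the distinction concrete, here is a minimal Python sketch (not
    part of the original exchange) that prints the Unicode names of the three
    characters being conflated. The constant and function names are
    illustrative only. It assumes nothing about fonts: it only shows that
    U+005C is REVERSE SOLIDUS and that the real yen and won signs live at
    other code points.

```python
import unicodedata

# Three distinct characters; Japanese/Korean Windows fonts merely draw
# the first one with the glyph of the second or third.
BACKSLASH = "\u005c"  # what Unicode assigns to U+005C
YEN = "\u00a5"        # the actual yen sign
WON = "\u20a9"        # the actual won sign


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


names = [describe(ch) for ch in (BACKSLASH, YEN, WON)]
for line in names:
    print(line)

# Whatever glyph the font shows, U+005C is ASCII, so its UTF-8 form
# is the single byte 0x5C.
utf8_backslash = BACKSLASH.encode("utf-8")
```

    So a separator that looks like a yen sign on screen is an ordinary ASCII
    character, with the same byte value in a UTF-8 filename.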

      Now, I'm really confused about what your target OS is, because locale
    names like 'en_US.UTF-8' indicate that you work on POSIX, but '\' is
    the file separator for Windows. If your target OS is Windows, as long
    as you use the Windows 'W' APIs (on Win9x/ME, MSLU - MS Layer for
    Unicode - offers 'emulation' of the 'W' APIs), you don't have to worry
    about it.
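
    The reason the target OS matters can be sketched with Python's pathlib,
    which models both path conventions without touching the file system.
    This is only an illustration under the assumption that the name contains
    a single U+005C; the file name itself is hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical name containing one U+005C character.
name = "dir\\file.txt"

# On POSIX, '\' is an ordinary filename character: one path component.
posix_parts = PurePosixPath(name).parts

# On Windows, '\' is the separator: two path components.
windows_parts = PureWindowsPath(name).parts

print(posix_parts)
print(windows_parts)
```

    The same string names one file on POSIX and a file inside a directory on
    Windows, which is why the answer depends on the platform rather than on
    the locale setting alone.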

