Re: Representing Unix filenames in Unicode

From: Philippe Verdy (
Date: Tue Nov 29 2005 - 10:01:30 CST

    From: "Antoine Leca" <>
    > On Tuesday, November 29th, 2005 07:03Z, Chris Jacobs wrote:
    >> What happens when two files have different, but canonically equivalent,
    >> file names?
    > The operating system sees two different files (without any relationship
    > one with the other), and you (the user, the "human") see two files with
    > apparently the same handle to grasp them (the same name).
    > My idea is that you are going to lose, so probably thou shalt not do
    > that.
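
    To make the situation quoted above concrete, here is a minimal Python
    sketch. The file name "café.txt" and the helper canonically_equivalent()
    are made up for illustration; the sketch only compares strings and never
    touches a real filesystem.

```python
import unicodedata

# Two spellings of the same visible name (hypothetical example name).
nfc_name = "caf\u00e9.txt"    # "é" as one precomposed code point
nfd_name = "cafe\u0301.txt"   # "e" followed by a combining acute accent

# As UTF-8 byte strings (what a Unix kernel compares) they differ.
nfc_bytes = nfc_name.encode("utf-8")
nfd_bytes = nfd_name.encode("utf-8")


def canonically_equivalent(a: str, b: str) -> bool:
    """Return True when both strings normalize to the same NFC form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(nfc_bytes, nfd_bytes)
    print(nfc_bytes == nfd_bytes)
    print(canonically_equivalent(nfc_name, nfd_name))
```

    The byte strings differ, so a filesystem that compares names byte by
    byte treats them as two unrelated files, even though both render
    identically to the reader.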

    If the filename is meant to be readable, yes, you won't be able to see the
    difference. But if you want to display a precise file name that can be used,
    for example, as a program parameter or in a URL, the Unicode filename needs
    to be escaped using some convention:
    * The URL encoding convention will be useful for the web (or even locally in
    "file:" URLs). The web now generally assumes that URLs should be encoded
    with UTF-8.
    * The shell escaping mechanism will be useful on Unix (you need to escape
    backslashes, quotes, controls...) if you want this Unicode string to fit
    in an 8-bit "char" string on a command line.
    * In command line parameters, the caller can still specify the encoding
    to use for displaying that escape-encoded binary parameter meaningfully.
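
    The three conventions above can be sketched in Python with the standard
    urllib.parse and shlex modules. This is only an illustration under
    assumptions: the file names are invented, and the variable names
    (raw_name, url_form, shell_form, legacy_form) are hypothetical.

```python
import shlex
from urllib.parse import quote, unquote, unquote_to_bytes

# A hypothetical file name as the raw bytes stored on disk: decomposed "é",
# plus a space and single quotes that a shell would otherwise interpret.
raw_name = "cafe\u0301 'v2'.txt".encode("utf-8")

# 1. URL convention: percent-encode every byte outside the unreserved set.
url_form = quote(raw_name)

# 2. Shell convention: quote the name so it survives as a single argument.
shell_form = shlex.quote(raw_name.decode("utf-8"))

# 3. Caller-specified encoding: the same escaped parameter is only
#    displayed meaningfully when decoded with the encoding the caller names.
legacy_form = quote("caf\u00e9.txt".encode("latin-1"))
shown_as_latin1 = unquote(legacy_form, encoding="latin-1")
shown_as_utf8 = unquote(legacy_form, encoding="utf-8")  # invalid byte is replaced

if __name__ == "__main__":
    print(url_form)
    print(shell_form)
    print(legacy_form, shown_as_latin1, shown_as_utf8)
```

    Unlike the displayed name, the percent-encoded form keeps the composed and
    decomposed spellings distinguishable, and both escaped forms decode back
    to exactly the original bytes.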
