Re: Representing Unix filenames in Unicode

From: Richard Wordingham
Date: Tue Nov 29 2005 - 02:31:46 CST


    Chris Jacobs wrote:

    > Marcin 'Qrczak' Kowalczyk wrote:
    >> "Doug Ewell" writes:

    >> So how do you propose to map filenames to strings on Unix?

    > How about quoted-printable?

    > "Quoted-printable encoding is one method used for mapping arbitrary bytes
    > into sequences of ASCII characters. This encoding is reversible, meaning
    > the original bytes and hence the non-ASCII characters they represent can
    > be recovered."
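
    To make the quoted-printable suggestion concrete, here is a minimal
    sketch of the idea applied to a filename's raw bytes. It is a simplified
    quoted-printable-style mapping rather than a full implementation of the
    MIME encoding (no soft line breaks, and spaces are escaped too). The
    function names qp_encode and qp_decode are hypothetical, and qp_decode
    assumes its input was produced by qp_encode.

```python
def qp_encode(name: bytes) -> str:
    """Map arbitrary bytes to printable ASCII, escaping the rest as =XX."""
    out = []
    for b in name:
        # Keep visible ASCII as-is, except '=' which introduces an escape.
        if 33 <= b <= 126 and b != ord("="):
            out.append(chr(b))
        else:
            out.append("=%02X" % b)
    return "".join(out)


def qp_decode(text: str) -> bytes:
    """Invert qp_encode; assumes well-formed input produced by qp_encode."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "=":
            # The two characters after '=' are the hex value of one byte.
            out.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)


raw = b"caf\xe9\x07"          # Latin-1 e-acute followed by a BEL byte
encoded = qp_encode(raw)      # "caf=E9=07"
assert qp_decode(encoded) == raw
```

    The mapping is reversible for every byte sequence, but the result is
    neither unique in the broader quoted-printable scheme nor pleasant to
    read for names that are mostly non-ASCII.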

    Or some backslash notation. The byte 07 may have a valid representation as
    U+0007, but it is not particularly friendly for typing - let alone 0C. Some
    backslash notation may be appropriate for these cases.
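
    As a rough illustration of what such a backslash notation could look
    like, the sketch below borrows the C-style escapes (\a for 07, \f for
    0C, and so on) and falls back to \xHH for other bytes outside printable
    ASCII. The choice of escapes, the treatment of bytes above 7E, and the
    names backslash_escape and backslash_unescape are all assumptions made
    for the example, not something specified in this thread.

```python
# C-style names for a few control bytes, plus the backslash itself.
_NAMED = {
    0x07: "\\a", 0x08: "\\b", 0x09: "\\t", 0x0A: "\\n",
    0x0B: "\\v", 0x0C: "\\f", 0x0D: "\\r", 0x5C: "\\\\",
}
# Reverse lookup keyed by the character that follows the backslash.
_REVERSE = {esc[1]: byte for byte, esc in _NAMED.items()}


def backslash_escape(name: bytes) -> str:
    """Render filename bytes as typeable ASCII using backslash escapes."""
    parts = []
    for b in name:
        if b in _NAMED:
            parts.append(_NAMED[b])
        elif 0x20 <= b <= 0x7E:
            parts.append(chr(b))
        else:
            # Any other byte (controls, DEL, 0x80 and above) becomes \xHH.
            parts.append("\\x%02x" % b)
    return "".join(parts)


def backslash_unescape(text: str) -> bytes:
    """Invert backslash_escape; assumes well-formed input."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] != "\\":
            out.append(ord(text[i]))
            i += 1
        elif text[i + 1] == "x":
            out.append(int(text[i + 2:i + 4], 16))
            i += 4
        else:
            out.append(_REVERSE[text[i + 1]])
            i += 2
    return bytes(out)


raw = b"bell\x07page\x0c"
assert backslash_unescape(backslash_escape(raw)) == raw
```

    Here the bytes 07 and 0C come out as the two-character sequences \a and
    \f, which can be typed on any keyboard.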

    Are you sure you need a *unique* string for a byte sequence? Consider links
    in UNIX, or the albeit doomed OpenVMS file-naming system. Files already
    have multiple names. Also, a canonicalisation function (and I don't mean
    one of the Unicode canonicalisations) would be much friendlier for input -
    it is not always easy to type in characters unsupported by the current


    This archive was generated by hypermail 2.1.5 : Tue Nov 29 2005 - 03:01:00 CST