RE: Roundtripping in Unicode

From: Lars Kristan
Date: Wed Dec 15 2004 - 07:57:14 CST

    Marcin 'Qrczak' Kowalczyk replied:
    > "Arcane Jill" writes:
    > > If so, Marcin, what exactly is the error, and whose fault is it?
    > It's an error to use locales with different encodings on the same
    > system.

    Ummmm, and whose fault is it?

    You can advise the users against it, but they won't necessarily listen.

    Switching to UTF-8 on UNIX opens two possibilities:

    1 - Users that HAD different encodings on the same system will now only have
    one, namely UTF-8.
    2 - Users that didn't have different encodings now may end up with different
    (and quite incompatible) encodings.
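
    To illustrate case 2, here is a small Python sketch (the filename is made
    up for illustration, and Latin-1 is only an assumed example of a legacy
    locale encoding). The same name created under a Latin-1 locale and under a
    UTF-8 locale ends up as different bytes on disk, and the Latin-1 bytes do
    not decode as UTF-8 at all:

```python
# Hypothetical filename containing U+00E9 (e with acute accent).
name = "caf\u00e9.txt"

# Bytes a user in a Latin-1 locale would write to the directory entry.
latin1_bytes = name.encode("latin-1")

# Bytes a user in a UTF-8 locale would write for the same name.
utf8_bytes = name.encode("utf-8")


def decodes_as_utf8(raw: bytes) -> bool:
    """Return True if `raw` is a well-formed UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")  # strict decoding raises on malformed input
        return True
    except UnicodeDecodeError:
        return False
```

    On a system that has switched to UTF-8, the Latin-1 form is one of the
    "offending" names: it is a perfectly legal UNIX filename, but not text.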

    Assuming everything will happen quickly, and on all systems is ... well,

    Once it happens, offending filenames should be rare. One could creep in for
    various reasons, not limited to malicious attempts.

    Automated or assisted upgrades to UTF-8 have been mentioned. For those that
    will be able to use them, great. I would even go a step further. I would
    incorporate a switch into UNIX filesystems that would enable a validator.
    This validator would keep filenames containing invalid UTF-8 (or certain
    other characters) from being created to start with. This is quite un-UNIX-like,
    but then so is UTF-8. Perhaps then we can declare UNIX filenames as text.
    Well, for the most part. Except for some applications that WILL still need
    to be able to access all files even on systems whose users will not decide
    (perhaps for valid reasons!) to enable that validator.
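
    A minimal sketch of the check such a validator might perform, written in
    Python for brevity. The function name is hypothetical, and the reading of
    "some other characters" as C0 control characters and DEL is an assumption
    for the sake of the example; a real filesystem switch would live in the
    kernel and could choose a different set:

```python
def is_acceptable_filename(name: bytes) -> bool:
    """Return True if `name` would pass the hypothetical validator."""
    # Empty names, '/' and NUL are already impossible in a UNIX
    # filename component, validator or not.
    if not name or b"/" in name or b"\x00" in name:
        return False

    # Strict UTF-8 decoding rejects stray continuation bytes, truncated
    # sequences, overlong forms and encoded surrogates.
    try:
        text = name.decode("utf-8")
    except UnicodeDecodeError:
        return False

    # Assumption: the "other characters" to refuse are C0 controls
    # (including newline and tab) and DEL.
    return not any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in text)
```

    With the switch off, names that fail this check can still exist, which is
    why some applications would still have to treat filenames as bytes.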

