From: Arcane Jill (email@example.com)
Date: Wed Dec 15 2004 - 04:27:07 CST
From: firstname.lastname@example.org On Behalf Of Philippe Verdy
Sent: 14 December 2004 22:47
To: Marcin 'Qrczak' Kowalczyk
Subject: Re: Roundtripping in Unicode
>From: "Marcin 'Qrczak' Kowalczyk" <email@example.com>
>> "Arcane Jill" <firstname.lastname@example.org> writes:
>>> If so, Marcin, what exactly is the error, and whose fault is it?
>> It's an error to use locales with different encodings on the same
I confess I don't know much about Unix, but still, I'm not sure your
assertion (Marcin) makes sense. Unix is a multi-user system. If you log on
as User A, then User B's settings are hidden from you, unless User B has
explicitly decided to share them. There may even be users of whose
existence you are not aware. Unix makes it possible for
/you/ to change /your/ locale - but by your reasoning, this is an error,
unless all other users do so simultaneously. Your reasoning implies that no
Unix user should ever change their locale unless they have an absolute
guarantee that all other users are going to do so simultaneously ... but I
don't know if you can ever get such a guarantee. Or maybe you're saying that
the error lies with Unix itself. Maybe that's fair comment, but I gather
Unix was invented before Unicode, so it can hardly be blamed for breaking
Unicode's conceptual model.
But it goes beyond that. Copy a file onto a floppy disc and then physically
take that floppy disc to a different Unix machine and log on as "guest" and
insert the disc ... Will the filename look the same? It would seem that "the
same system" is effectively every Unix machine on the planet, since files
may be interchanged between them.
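To make the floppy-disc scenario concrete, here is a minimal Python sketch. It assumes one user's locale is ISO-8859-1 and the other's is UTF-8, and the filename "café.txt" is a hypothetical example, not one taken from this thread. The filesystem stores only bytes, so each user sees whatever those bytes decode to under their own locale.

```python
# Hypothetical filename, chosen only for illustration.
name = "café.txt"

# Bytes that end up on disk when the file is created under each locale.
latin1_bytes = name.encode("iso-8859-1")   # b'caf\xe9.txt'
utf8_bytes = name.encode("utf-8")          # b'caf\xc3\xa9.txt'


def view(raw: bytes, encoding: str) -> str:
    """Decode raw filename bytes the way a user with this locale encoding would."""
    # errors="replace" substitutes U+FFFD for bytes that are invalid in `encoding`.
    return raw.decode(encoding, errors="replace")


# A UTF-8 user looking at the file created by the Latin-1 user.
seen_by_utf8_user = view(latin1_bytes, "utf-8")

# A Latin-1 user looking at the file created by the UTF-8 user.
seen_by_latin1_user = view(utf8_bytes, "iso-8859-1")

print(repr(seen_by_utf8_user))
print(repr(seen_by_latin1_user))
```

Neither view matches the name the creator typed, and the Latin-1 bytes are not even valid UTF-8, which is the roundtripping case this thread is about.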
The obvious solution is for all Unix machines everywhere to be using the
same locale - and it had better be UTF-8. But an instantaneous global
switch-over is never going to happen, so we see this gradual switch-over ...
and it is during this transition phase that Lars's problem manifests.
>More simply, I think that it's an error to have the encoding part of any
which again attaches blame to Unix itself. All very "not my problem", but I
think Lars has found that it actually /is/ his problem. (Not that I support
>The system should not depend on them, and for critical things like
>filesystem volumes, the encoding should be forced by the filesystem itself,
>and applications should mandatorily follow the filesystem rules.
Of course, you are not /really/ suggesting that the Unix kernel
be rewritten. But it's hard for me to see how else this could be
>Now think about the web itself: it's really a filesystem, with billions
>users, or trillion applications using simultaneously hundreds or thousands
>of incompatible encodings... Many resources on the web seem to have valid
>URLs for some users but not for others, until URLs are made independent of
>any user locale, and then not considered as encoded plain-text but only as
>strings of bytes.
Oh yeah - and that too. Well spotted.
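The same effect can be sketched for URLs using Python's standard urllib.parse. The path segment "café" and the two encodings are assumptions picked for illustration. The percent-escapes carry bytes only, so the text a reader recovers depends on the encoding they guess.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

# Hypothetical path segment, chosen only for illustration.
segment = "café"

# The same text yields different URLs depending on the encoding used to escape it.
url_latin1 = quote(segment, encoding="iso-8859-1")
url_utf8 = quote(segment, encoding="utf-8")

# Treated as a string of bytes, the URL is unambiguous.
raw = unquote_to_bytes(url_latin1)

# Treated as encoded text, the result depends on the reader's assumed encoding.
read_as_utf8 = unquote(url_latin1, encoding="utf-8", errors="replace")
read_as_latin1 = unquote(url_latin1, encoding="iso-8859-1")

print(url_latin1, url_utf8, raw, repr(read_as_utf8), read_as_latin1)
```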