From: Peter Kirk (peterkirk@qaya.org)
Date: Fri Dec 17 2004 - 05:43:58 CST
On 17/12/2004 10:13, Arcane Jill wrote:
> ...
>
> One last question - why /can't/ locale conversion be automated? I
> don't really get this one, but it's the root of this whole topic.
> Surely, if we make the following assumptions:
> (1) No user has a locale of UTF-8, and
> (2) Some users will have created UTF-8 filenames and UTF-8 text files,
> and
> (3) Some of those text files may have been concatenated, leading to
> mixed-encoding text files
> then we can surely automate everything. (Requirement (1) can be met
> simply by asking all users who have changed their locale to UTF-8 to
> change it back again, temporarily). ...
This locale change is not exactly simple for (future?) users who only
speak and use a language which is supported only by UTF-8 - which would
include most Indians and SE Asians for a start.
> Assuming these requirements, all you have to do is:
>
> ...
> # if (the file can be positively identified as a text file)
> # {
> # re-encode all non-UTF-8 substrings (assuming them to
> # be in the user's locale) to UTF-8
This assumption is invalid. I have on my system a number of files which
are text files but encoded neither in UTF-8 nor in my own locale. I read
them either with programs which can display them according to their
locale (which is not encoded within the file) or by using substitution
fonts (which is justified because many of them were written for use in
such obsolescent setups). This kind of automated conversion would cause
disastrous damage.
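
To make the failure concrete, here is a minimal sketch of the kind of conversion proposed above. It is only an illustration under stated assumptions: the function name naive_reencode is hypothetical, the locale is assumed to be Latin-1, the legacy file is assumed to be KOI8-R, and the whole file is treated as one string rather than being split into substrings.

```python
# Hypothetical illustration only. Both encodings below are assumptions
# chosen for the example, not details of any real system.
LOCALE_ENCODING = "latin-1"   # assumed user locale
ACTUAL_ENCODING = "koi8-r"    # assumed real encoding of a legacy text file


def naive_reencode(data: bytes, locale_encoding: str) -> bytes:
    """Leave valid UTF-8 alone; otherwise assume the locale encoding."""
    try:
        data.decode("utf-8")
        return data  # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        # The assumption under discussion: non-UTF-8 means "user's locale".
        return data.decode(locale_encoding).encode("utf-8")


original_text = "Привет"
legacy_bytes = original_text.encode(ACTUAL_ENCODING)

converted = naive_reencode(legacy_bytes, LOCALE_ENCODING)

# The conversion raises no error, but the text it produces is not the
# text that was written.
print(converted.decode("utf-8") == original_text)

# A program that used to read the file correctly as KOI8-R no longer does.
print(converted.decode(ACTUAL_ENCODING) == original_text)
```

Nothing in the converter can detect the mistake, because every byte sequence is valid Latin-1. The file is silently rewritten, and the programs that previously displayed it correctly now show something else.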
-- Peter Kirk peter@qaya.org (personal) peterkirk@qaya.org (work) http://www.qaya.org/