Re: unicode on Linux

From: Stephane Bortzmeyer (bortzmeyer@nic.fr)
Date: Thu Oct 23 2003 - 02:47:46 CST

Next message: Stephane Bortzmeyer: "Re: unicode on Linux"
Previous message: jarkko.hietaniemi@nokia.com: "[OT] RE: GDP by language"
In reply to: Edward H. Trager: "Re: unicode on Linux"
Next in thread: Stefan Persson: "Re: unicode on Linux"
Reply: Stefan Persson: "Re: unicode on Linux"

On Tue, Oct 21, 2003 at 11:32:28AM -0400,
Edward H. Trager <ehtrager@umich.edu> wrote
a message of 118 lines which said:

> I think there can be big debates about whether a Linux (or any *nix
> kernel, for that matter) has any business normalizing file names.
> Personally I think Unicode normalization is not the kernel's
> business. This is better left to the userland applications.

I do not agree. It would mean that *each* application has to normalize,
because it cannot rely on the kernel. This has huge security
implications: two file names can be identical once normalized to NFC,
and so visually impossible to distinguish, yet be two different
strings of code points.
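
To make the problem concrete, here is a minimal sketch in Python. The
file name "é.txt" and the helper same_after_nfc are illustrative
assumptions only, not something taken from an actual system. It builds
the same visible name from two different sequences of code points and
shows that they compare as different strings until both are normalized.

```python
import unicodedata

# Two spellings of the same visible name (hypothetical example name).
precomposed = "\u00e9.txt"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301.txt"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT


def same_after_nfc(a: str, b: str) -> bool:
    """Return True if the two strings are equal once both are in NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# As raw strings (and as UTF-8 bytes) the two names differ, so a file
# system that compares bytes would treat them as two distinct files.
print(precomposed == decomposed)
print(precomposed.encode("utf-8"), decomposed.encode("utf-8"))

# After normalization they are the same name.
print(same_after_nfc(precomposed, decomposed))
```

An application that wants to detect such collisions has to apply this
kind of comparison itself wherever the layer below it compares file
names as plain byte strings.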

Normalization has to be done in the kernel for the same reason that
access control (the rwx bits in Unix) has to be in the kernel: so that
no application can bypass it.

> Are you sure about ls? ls should sort UTF-8-encoded file names in
> raw Unicode order, n'est-ce pas?

Yes, but this order has no linguistic meaning (in French, é should not
sort after z).
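
A small Python sketch of the difference, using three made-up file
names. The helper base_letters is a hypothetical and deliberately crude
stand-in for real collation: it only strips combining marks before
comparing, which is not the French tailoring of the Unicode Collation
Algorithm, but it is enough to show where raw code point order goes
wrong.

```python
import unicodedata

# Hypothetical file names; "\u00e9" is é in precomposed form.
names = ["zoo", "\u00e9t\u00e9", "eau"]


def base_letters(s: str) -> str:
    """Decompose to NFD and drop combining marks (crude approximation)."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


# Raw code point order: é (U+00E9) sorts after z (U+007A).
code_point_order = sorted(names)

# Approximate dictionary order: compare base letters first, then use
# the original string as a tie-breaker.
approx_order = sorted(names, key=lambda s: (base_letters(s), s))

print(code_point_order)
print(approx_order)
```

In the first list "été" ends up after "zoo"; in the second it sits
between "eau" and "zoo", which is where a French reader would look
for it.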

> What about ICU's regexp package?
> (http://oss.software.ibm.com/icu/userguide/regexp.html) You should
> be able to use ICU on *any* platform. Linux does not yet having a
> Unicode grep

I never said that Unix cannot be "Unicodized". I just said that it is
not Unicodized. That's why I talked about an "act of faith". You need
to configure many things and to compile many things before you have a
working Unicode environment.

> I thought both Postgres and MySQL already have, or are working on
> this issue?

Neither of them has it. They claim "Unicode support", which only means
that they can store and retrieve UTF-8.


This archive was generated by hypermail 2.1.5 : Thu Jan 18 2007 - 15:54:24 CST