Re: UNIX Unicode APIs (*not* character sets)

From: David Starner (dvdeug@x8b4e53cd.dhcp.okstate.edu)
Date: Sun Aug 20 2000 - 18:24:15 EDT


On Mon, Jul 31, 2000 at 09:58:48AM -0800, jarkko.hietaniemi@nokia.com wrote:
> what is the status of various UNIXes and lookalikes as far
> as "Unicode objects", that is, anything named using Unicode
> encodings like UTF-8 or UTF-16XX, are concerned.
> The most likely candidate for such naming of course
> being filesystem objects, that is, filenames. I know that Win32 has
> something called "wide names" in NTFS. In UNIX lands NFSv4 (whenever it
> comes out...Solaris 9?) is supposed to have UTF8 filenames. But is there
> anything else out there/being planned?

Linux (and probably other Unixes - I don't know) accepts arbitrary byte
sequences for filenames, so long as they don't include '/' or '\0'
(even the C0 control characters are accepted). The userland programs
interpret the bytes in the locale character set. Solaris and other
Unixes have UTF-8 locales, but Linux won't really have UTF-8 locales
until glibc 2.2 comes out. The extent of the APIs on most Unixes
(Un*xes too) is the various iconv and mbtowc interfaces, which treat
Unicode as merely one of many character sets (at least externally).
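
To illustrate the point about byte-sequence filenames, here is a minimal
sketch in Python. It assumes a Linux system with a filesystem that does
not validate names (some filesystems on other platforms reject bytes
that are not valid UTF-8). The helper name roundtrip_byte_filename and
the sample name are made up for this example.

```python
import os
import tempfile


def roundtrip_byte_filename(raw_name: bytes) -> bytes:
    """Create an empty file named by raw bytes and return the name as listed."""
    with tempfile.TemporaryDirectory() as tmp:
        # Work with bytes paths throughout so Python does no decoding for us.
        directory = os.fsencode(tmp)
        path = os.path.join(directory, raw_name)
        with open(path, "wb"):
            pass
        # Passing a bytes directory makes os.listdir return bytes names.
        (listed,) = os.listdir(directory)
        return listed


# In Latin-1, U+00E9 is the single byte 0xE9, which is not valid UTF-8 here.
latin1_name = "caf\u00e9.txt".encode("latin-1")
stored = roundtrip_byte_filename(latin1_name)

# The same stored bytes read differently depending on the assumed charset.
as_latin1 = stored.decode("latin-1")
as_utf8 = stored.decode("utf-8", errors="replace")
```

The name comes back byte for byte as it was written. Only the decoding
step, which stands in for what a userland program does under its locale,
decides whether the result is readable text or a replacement character.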

-- 
David Starner - dstarner98@aasaa.ofe.org
http/ftp: x8b4e53cd.dhcp.okstate.edu
It was starting to rain on the night that they cried forever,
It was blinding with snow on the night that they screamed goodbye.
	- Dio, "Rock and Roll Children"


