Re: Character classification and casing and locales

From: Keld Jørn Simonsen
Date: Tue Nov 26 1996 - 20:14:24 EST

unicode@Unicode.ORG writes:

> There are two separate questions:
> 1. Is anyone working on Locale support for Unicode encoded strings?
> I suppose people are doing such things, but I don't have a definitive answer.
> There are some tables that the consortium provides for DEFAULT character
> classification, etc.

I am working on ISO/IEC 10646 support in C; is that close enough?
I am working from the POSIX specifications for the default character
classification etc.

> 2. Are there any better ways to do [upper/lower-casing, etc], and other locale
> dependent character operations?

We are assuming a paradigm of coded character set independence, which
I think is a better approach.

> The Unicode FTP site, and the standard, provides a default upper/lower case
> table. I think that this particular operation is typically the same
> everywhere, with the exception of the dotted upper-case "I" in Turkish. You
> could use the table as a default, if you don't have other information you'd
> prefer to use.

The site with the ISO POSIX (WG15) collection of charmaps and locales also
does this, and it also follows your observation that case conversion
seems to be quite uniform across cultures.
( is the place)
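
To make the "default table plus locale exception" idea concrete, here is a
rough sketch in C. It is only an illustration, not code from either
collection of tables: the names ucs_t, case_pair, ucs_toupper and the "tr"
language tag are hypothetical, and the tables hold just a few entries where
a real implementation would generate them from the published data.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type for a 10646/Unicode code point. */
typedef unsigned long ucs_t;

struct case_pair {
    ucs_t lower;
    ucs_t upper;
};

/* Tiny excerpt standing in for a full default lower-to-upper table. */
static const struct case_pair default_upper[] = {
    { 0x00E6, 0x00C6 },  /* ae ligature */
    { 0x00F8, 0x00D8 },  /* o with stroke */
    { 0x0131, 0x0049 }   /* dotless i -> I */
};

/* Locale exception: in Turkish, i uppercases to dotted capital I. */
static const struct case_pair turkish_upper[] = {
    { 0x0069, 0x0130 }
};

/* Linear search; returns 1 and stores the mapping if c is in the table. */
static int lookup(const struct case_pair *tab, size_t n, ucs_t c, ucs_t *out)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (tab[i].lower == c) {
            *out = tab[i].upper;
            return 1;
        }
    }
    return 0;
}

/* Uppercase c: consult the locale exceptions first, then the defaults. */
ucs_t ucs_toupper(ucs_t c, const char *lang)
{
    ucs_t r;

    if (lang != NULL && strcmp(lang, "tr") == 0 &&
        lookup(turkish_upper,
               sizeof turkish_upper / sizeof turkish_upper[0], c, &r))
        return r;

    /* Basic Latin a..z, written as code points to stay independent
       of the execution character set. */
    if (c >= 0x0061 && c <= 0x007A)
        return c - 0x0020;

    if (lookup(default_upper,
               sizeof default_upper / sizeof default_upper[0], c, &r))
        return r;

    return c;  /* no case mapping known: return unchanged */
}
```

With this sketch, ucs_toupper(0x0069, NULL) gives 0x0049 (I), while
ucs_toupper(0x0069, "tr") gives 0x0130 (dotted capital I); every other
character goes through the same default path whatever the language.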

