Re: LC_CTYPE locale category and character sets.

From: John Cowan
Date: Thu Jul 16 1998 - 15:06:40 EDT

Keld Jørn Simonsen wrote:

> In principle not, in practice possibly. It is advocated that
> all character properties stay the same across character sets
> and language/country/culture. But in a culture there may be
> specific recommendations on what is considered e.g. a letter, a digit,
> or a punctuation mark. In some cultures e.g. Devanagari digits
> are recognised as digits, while in others these may just be
> considered some kind of strange special character. Also for
> punctuation marks, e.g. quotation marks vary widely from culture
> to culture.

The *preferred* quote mark varies, yes, but what *is* a quote
mark is invariable. I would never mark quotations with guillemets,
but I recognize guillemets as quotation marks.

> > But is there any known example of a LC_CTYPE character property
> > (isalpha, isupper, tolower, isdigit, isxdigit ...)
> > which changes or should change from one culture to another?
> isupper/islower for Turkish is a prime example.
> Uppercase of initial "ij" in Dutch (both letters become uppercase)
> is another.

These are case *mappings*. Turkish has a specific case mapping
rule, but it shares the same case *properties* as every other
language, as to what is uppercase and what is lowercase.
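
To make the distinction concrete, here is a minimal sketch in Python.
The helper names turkish_upper and turkish_lower are made up for
illustration and only cover the dotted/dotless i pairs; they are not
a full locale-aware implementation. The point is that the mapping is
tailored for Turkish while the uppercase/lowercase properties of each
character are the same for everyone.

```python
import unicodedata

DOTTED_CAPITAL_I = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"    # LATIN SMALL LETTER DOTLESS I


def turkish_upper(s: str) -> str:
    """Hypothetical helper: uppercase with the Turkish i tailoring.

    i -> dotted capital I, dotless i -> I; everything else uses the
    default (language-independent) mapping.
    """
    return s.replace("i", DOTTED_CAPITAL_I).replace(DOTLESS_SMALL_I, "I").upper()


def turkish_lower(s: str) -> str:
    """Hypothetical helper: lowercase with the Turkish i tailoring."""
    # Handle the dotted capital first so the plain "I" rule cannot touch it.
    return s.replace(DOTTED_CAPITAL_I, "i").replace("I", DOTLESS_SMALL_I).lower()


# Case *mappings* differ between the default and the Turkish tailoring.
default_upper_of_i = "i".upper()
turkish_upper_of_i = turkish_upper("i")

# Case *properties* do not depend on the language at all.
categories = {c: unicodedata.category(c)
              for c in ("i", "I", DOTLESS_SMALL_I, DOTTED_CAPITAL_I)}
```

Here "i" and dotless i are lowercase (category Ll) and "I" and dotted
capital I are uppercase (category Lu) regardless of which mapping is
applied; only the pairing between them changes.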

John Cowan
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)
