Re: LC_CTYPE locale category and character sets.

From: John Cowan (cowan@locke.ccil.org)
Date: Thu Jul 16 1998 - 15:06:40 EDT


Keld Jørn Simonsen wrote:

> In principle not, in practice possibly. It is advocated that
> all character properties stay the same across character sets
> and language/country/culture. But in a culture there may be
> specific recommendations on what is considered e.g. a letter, a digit,
> or a punctuation mark. In some cultures e.g. Devanagari digits
> are recognised as digits, while in others these may just be
> considered some kind of strange special character. Also for
> punctuation marks, e.g. quotation marks vary widely from culture
> to culture.

The *preferred* quote mark varies, yes, but what *is* a quote
mark is invariable. I would never mark quotations with guillemets,
but I recognize guillemets as quotation marks.
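
A minimal sketch of that point, assuming today's Unicode Character Database as exposed by Python's standard unicodedata module: the classification of a character as quote punctuation is looked up without consulting any locale. The helper name looks_like_paired_quote is hypothetical, introduced only for this illustration.

```python
import unicodedata

# General categories Pi and Pf mean "initial" and "final" quote punctuation.
PAIRED_QUOTE_CATEGORIES = {"Pi", "Pf"}


def looks_like_paired_quote(ch: str) -> bool:
    """Hypothetical helper: True if ch is classified as Pi or Pf."""
    return unicodedata.category(ch) in PAIRED_QUOTE_CATEGORIES


# Guillemets and curly double quotes: the preference differs by culture,
# but the lookup below never asks which locale is active.
for ch in "\u00ab\u00bb\u201c\u201d":
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
```

This is only an approximation of "what is a quote mark": the ASCII quotation mark U+0022 is in category Po and U+201E is in category Ps, so a fuller test would need more than the general category.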

> > But are there any known examples of an LC_CTYPE character property
> > (isalpha, isupper, tolower, isdigit, isxdigit ...)
> > which changes or should change from one culture to another ?
>
> isupper/islower for Turkish is a prime example.
> Uppercase of initial "ij" in Dutch (becomes both uppercase)
> is another.

These are case *mappings*. Turkish has a specific case mapping
rule, but it shares the same case *properties* as every other
language, as to what is uppercase and what is lowercase.
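
The distinction can be sketched in Python, whose built-in str methods use the default, locale-independent Unicode case data. The functions turkish_upper and turkish_lower are hypothetical helpers written for this illustration, not part of any library, and they cover only the dotted/dotless i rule.

```python
# Python's str.upper()/str.lower() apply the default Unicode mappings
# and know nothing about Turkish, so the tailoring is done by hand.

DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I


def turkish_upper(text: str) -> str:
    """Hypothetical Turkish mapping: i -> dotted capital I, dotless i -> I."""
    return text.replace("i", DOTTED_CAPITAL_I).upper()


def turkish_lower(text: str) -> str:
    """Hypothetical Turkish mapping: dotted capital I -> i, I -> dotless i."""
    return (text.replace(DOTTED_CAPITAL_I, "i")
                .replace("I", DOTLESS_SMALL_I)
                .lower())


# Case *properties*: the same answer for each character in every language.
case_properties = {ch: (ch.isupper(), ch.islower())
                   for ch in ("I", "i", DOTTED_CAPITAL_I, DOTLESS_SMALL_I)}

# Case *mappings*: the default and the Turkish rule disagree on "i".
default_result = "i".upper()         # plain "I"
turkish_result = turkish_upper("i")  # dotted capital I, U+0130
```

Here case_properties reports both capital forms as uppercase and both small forms as lowercase, while only the mapping functions differ between the default and the Turkish rule.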

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)


