Re: CLDR

From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Tue May 16 2006 - 02:32:40 CDT

Next message: Philippe Verdy: "Re: Win IE 7b2 and UTF-8"

Previous message: Balasankar: "CLDR"
In reply to: Balasankar: "CLDR"
Next in thread: Asmus Freytag: "Re: CLDR"
Reply: Asmus Freytag: "Re: CLDR"

On Tue, 16 May 2006, Balasankar wrote:

> Whether the union of Exemplar & auxiliary exemplar character set should
> contain all the possible characters used in the particular language?

No. It is impossible to list the characters used in a language; the
set is very fuzzy, with membership ranging from core characters (such as
"a" in English) through marginal characters (like "é", i.e. "e" with
acute, in English) to characters that may appear in special words,
typically borrowings, perhaps _very_ rarely. Moreover, these sets are
currently supposed to list _letters_ only. The two sets make it possible
to give only a rather rough description of the letters used in a
language, and the choices made are often rather debatable.
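
To make the fuzziness concrete, here is a minimal Python sketch of how
text could be checked against the two sets. The sets below are
hand-picked assumptions for illustration, not the actual CLDR data for
English, and classify_letters is a hypothetical helper name.

```python
import unicodedata

# Hypothetical sets for illustration only; NOT copied from CLDR.
MAIN = set("abcdefghijklmnopqrstuvwxyz")
AUXILIARY = set("áàâäçéèêëíïñóôöúü")

def classify_letters(text):
    """Group the distinct letters of text by which set covers them."""
    groups = {"main": set(), "auxiliary": set(), "uncovered": set()}
    # Normalize to NFC so that "e" + combining acute matches "é".
    for ch in unicodedata.normalize("NFC", text).lower():
        if not ch.isalpha():
            continue  # the sets are meant to describe letters only
        if ch in MAIN:
            groups["main"].add(ch)
        elif ch in AUXILIARY:
            groups["auxiliary"].add(ch)
        else:
            groups["uncovered"].add(ch)
    return groups

result = classify_letters("The café served smørrebrød")
print(sorted(result["auxiliary"]), sorted(result["uncovered"]))
```

Any letter that lands in "uncovered" (here the "ø" of a borrowed word)
shows that the union of the two sets does not bound what may appear in
real text; where to draw the line is exactly the debatable choice.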

It isn't even clear what the intended _use_ of the sets is, or what the
actual use will be. There is a large number of imaginable uses, each with
its own implications for what the grounds for defining the sets should
really be. I'm afraid the (mostly implicit) criteria applied now make the
sets incommensurable across languages.

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/


This archive was generated by hypermail 2.1.5 : Tue May 16 2006 - 02:42:08 CDT