Re: Tibetan Collation

From: Mark Davis (mark@macchiato.com)
Date: Sat Dec 13 2008 - 17:11:52 CST


(adding unicode.org, since this might be relevant to some there)
The main collation charts are for the UCA, which is a language-neutral
ordering of all Unicode characters. The UCA ordering is not necessarily
correct for any language using any particular script, since the
adjustments required to get correct sorting typically have to be made on a
language-by-language basis. You happen to be looking at the chart for
characters in the Tibetan script using the UCA.
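
As a rough illustration of why a single language-neutral order cannot be
right for every language, here is a toy Python sketch. It does not use the
UCA tables or any CLDR data: both key functions and the Swedish alphabet
string are assumptions made up for this example, standing in for the default
ordering and for a language-specific rule set.

```python
import unicodedata


def default_key(word):
    """Toy language-neutral key: base letters first, accents as a tie-breaker.

    This only mimics the general idea of a default ordering; it is NOT the UCA.
    """
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return (base, decomposed)


# Hypothetical stand-in for a language-specific rule set: in Swedish,
# a-ring, a-umlaut and o-umlaut are separate letters sorted after "z".
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"


def swedish_key(word):
    """Toy tailored key; raises ValueError for characters outside the alphabet."""
    composed = unicodedata.normalize("NFC", word)
    return [SWEDISH_ALPHABET.index(c) for c in composed]


words = ["zebra", "\u00e4pple", "apple"]

default_sorted = sorted(words, key=default_key)
swedish_sorted = sorted(words, key=swedish_key)

print(default_sorted)  # a-umlaut word sorts next to "apple"
print(swedish_sorted)  # a-umlaut word sorts after "zebra"
```

The same three strings come out in two different orders, and neither order
is wrong; they simply answer to different rules. The Tibetan chart is in the
position of the first list.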

As yet, we don't show collation charts for the language-specific rules in
CLDR. In any event, there is currently no locale data for Tibetan [bo], and
we wouldn't add collation data for a language unless there is at least
minimal general locale data for it.

There is locale data for Dzongkha [dz] (
http://www.unicode.org/cldr/data/common/main/dz.xml), although minimal, and
a collation sequence for that: see
http://www.unicode.org/cldr/data/common/collation/dz.xml. The status is
draft="unconfirmed", because there has not been enough participation. CLDR
locales work somewhat like an open-source project - locale data is developed
and enhanced based on the interest of participating organizations in doing
the work for the particular language. These can be Unicode members, but also
liaison organizations (http://www.unicode.org/consortium/memblogo.html#liais)
and others. If you have any further questions, please let me know.

Mark

On Sat, Dec 13, 2008 at 06:58, Christopher Fynn <cfynn@gmx.net> wrote:

> The current collation chart for Tibetan <
> http://www.unicode.org/charts/collation/chart_Tibetan.html>
> looks *completely* broken as it does not handle prefixes, rago,
> lago, sago, etc.
>
>
> Tibetan should be almost identical to Dzongkha
> e.g. <http://developer.mimer.com/charts/dzongkha.htm>
> which also gives a correct collation for Tibetan.
>
>
>
> - Chris
>
>
>



This archive was generated by hypermail 2.1.5 : Fri Jan 02 2009 - 15:33:07 CST