Tibetan (was: Re: A basic question on encoding Latin characters)

From: Christopher John Fynn (cfynn@dircon.co.uk)
Date: Wed Sep 29 1999 - 23:04:30 EDT

Next message: peter_constable@sil.org: "Re: Terminology question: character-like thing"
Previous message: Kenneth Whistler: "The politics of Unicode (was: an endlessly coruscating thread on a basic question)"
In reply to: Scott Horne: "Re: A basic question on encoding Latin characters"
Next in thread: Kevin Bracey: "RE: RE: A basic question on encoding Latin characters"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]
Mail actions: [ respond to this message ] [ mail a new topic ]

Scott Horne <shorne@metaphasetech.com> wrote:

> Christopher John Fynn wrote:

> > I think this is unfair. For instance I know that over several years
> > members of the UTC and WG2 have spent a great deal of time and effort
> > on the Tibetan block encoding - which was difficult because of the
> > peculiarities of the script,
> It's hardly more peculiar than Devanâgarî.

In fact, once you get right into it, I think you will find Tibetan is
considerably more complex than Devanagari - at least modern incarnations
of Devanagari. This is one of the reasons it was eventually found
necessary to adopt a very different model in the Tibetan block from that
used for other Indic scripts.
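
To make the difference in models concrete, here is a minimal Python sketch
using only the standard unicodedata module. It assumes the usual
virama-based encoding of a Devanagari conjunct and contrasts it with the
explicitly encoded subjoined consonants of the Tibetan block. The cluster
chosen (ka + ssa) and the helper name describe() are only illustrations,
and the names printed come from whatever Unicode version your Python
ships with, not from 3.0 itself.

```python
import unicodedata

# Devanagari conjunct ka + ssa: consonant, VIRAMA, consonant.
devanagari_kssa = "\u0915\u094D\u0937"

# Tibetan stack ka + ssa: head consonant followed by an explicitly
# encoded subjoined consonant; no virama is involved.
tibetan_kssa = "\u0F40\u0FB5"


def describe(text):
    """Return (code point, character name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for label, text in (("Devanagari", devanagari_kssa), ("Tibetan", tibetan_kssa)):
    print(label)
    for code_point, name in describe(text):
        print(f"  {code_point}  {name}")
```

The Devanagari sequence is three characters with a joining sign in the
middle, while the Tibetan one is two characters, the second of which is
itself the "below" form.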

> > its poor documentation,

> Tibetan is poorly documented? I suppose so, if you're looking for a handy
> little mass-market paperback guidebook that can be found in any bookshop.
> If you look a little harder, and perhaps even take the time (fancy!) to
> learn a bit about the Tibetan language, you'll find abundant information.

Yes, it is poorly documented. There was, e.g., no list anywhere of all the
combinations that occur in Tibetan. I know there are rules about which
characters can combine with each other in proper Tibetan and in Sanskrit
written in Tibetan, but all these rules break down when it comes to e.g.
writing Mongolian and Chinese words in Tibetan, combinations in certain
astrological and medical texts and, especially, in forming Tibetan
contractions (kung yig).

BTW I know Tibetan quite well - in fact I've been speaking and reading
Tibetan on a daily basis for 29 years now. I've also written a number of
different Tibetan applications and created several Tibetan fonts.

> > lack of standardisation
>
> The script itself is used in a standard way. Are you talking about lack
> of standard implementations on computers? True enough--but this is the
> case for most scripts.

No, it is not always used in a standardised way when written in texts -
this is the sort of thing that is poorly documented.

> > and the fact that existing
> > (non-Unicode) implementations of Tibetan in software applications are
> > not very satisfactory.

(including my own)

> The block encoding of Tibetan was also not very satisfactory the last time
> I saw it (2.0, I must admit). If I recall correctly, a number of recently
> created letters that are used in loan-words were missing.

Do look at 3.0 - this now allows you to encode any consonant stacked with
any other consonant (something which occurs in Tibetan, though you won't
find it documented in any book on the Tibetan script or language - even
native ones). There are no real "recently created" characters - what you
are thinking of is probably the letters pa (U+0F54) and pha (U+0F55)
combined with tsa-'phru (U+0F39), which are used by Tibetans in India for
transliterating the sounds "va" and "fa". This is not in fact new, though
in Tibetan texts published in the PRC other ways of transliterating these
sounds are used. In old texts and in contractions you will also find
tsa-'phru combining with several other consonants.
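
As a rough illustration of the two points above, the following Python
sketch builds a consonant stack from a head letter plus subjoined letters,
and forms the "va" and "fa" sequences from pa and pha followed by
tsa-'phru. The helper names subjoined() and stack() are hypothetical, and
the mapping relies on the assumption that each "TIBETAN LETTER X" has a
matching "TIBETAN SUBJOINED LETTER X" in the character names known to your
Python's unicodedata module (which reflects a later Unicode version than
3.0). Treat it as a sketch, not as a complete input method.

```python
import unicodedata

TSA_PHRU = "\u0F39"  # the tsa-'phru mark, a combining character


def subjoined(letter):
    """Map a Tibetan base consonant to its subjoined form by name lookup."""
    name = unicodedata.name(letter)
    prefix = "TIBETAN LETTER "
    if not name.startswith(prefix):
        raise ValueError(f"not a Tibetan base letter: {name}")
    # unicodedata.lookup raises KeyError if no such subjoined letter exists.
    return unicodedata.lookup("TIBETAN SUBJOINED LETTER " + name[len(prefix):])


def stack(head, *below):
    """Encode a stack: the head letter, then each lower letter subjoined."""
    return head + "".join(subjoined(ch) for ch in below)


# pa and pha with tsa-'phru, as used for "va" and "fa".
va = "\u0F54" + TSA_PHRU
fa = "\u0F55" + TSA_PHRU

# A two-level stack (ka over ssa) and a three-level one (sa, ka, ya).
ka_ssa = stack("\u0F40", "\u0F65")
sa_ka_ya = stack("\u0F66", "\u0F40", "\u0F61")

for text in (va, fa, ka_ssa, sa_ka_ya):
    print(" ".join(f"U+{ord(ch):04X}" for ch in text))
```

Because the stack is just a sequence of code points, nothing in the
encoding restricts which consonants may be combined; whether a given stack
displays properly is up to the font and rendering software.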

As far as Tibetan vowels and consonants go, everything is now covered, as
is most punctuation. The only "missing" Tibetan characters are some unusual
characters found in some rNying-ma and Bon-po texts, some unusual
punctuation marks and carets, and some astrological symbols used in
almanacs. There is also probably a need for subjoined combining forms of
U+0F88 and U+0F89. However, no one should hold off implementing Tibetan
until these characters are added - you can make a very good implementation
of Tibetan using Unicode v3.0. These missing characters, if and when they
are added, should not affect the basic architecture or coding of any
Tibetan implementation based on v3.0.

BTW if you are really interested in Tibetan or have any questions about the
properties etc. of characters in the Tibetan block you might want to join
the Tibetan list tibex@unicode.org mentioned by Sarasvati. I think you can
ask Sarasvati to subscribe you.

Regards

- Chris

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:53 EDT