From: Doug Ewell (dewell@adelphia.net)
Date: Sat Jun 03 2006 - 15:47:33 CDT
Samuel Thibault <samuel dot thibault at ens dash lyon dot org> wrote:
>>> IIRC the only needed letters (in addition to ASCII) are as follows:
>>>
>>> a A à À ả Ả ã Ã á Á ạ Ạ ă Ă ằ Ằ ẳ Ẳ ẵ Ẵ ắ Ắ ặ Ặ â Â ầ Ầ ẩ Ẩ ẫ Ẫ ấ Ấ
>>> ậ Ậ d D đ Đ Đ e E è È ẻ Ẻ ẽ Ẽ é É ẹ Ẹ ê Ê ề Ề ể Ể ễ Ễ ế Ế ệ Ệ i I ì
>>> Ì ỉ Ỉ ĩ Ĩ í Í ị Ị o O ò Ò ỏ Ỏ õ Õ ó Ó ọ Ọ ô Ô ồ Ồ ổ Ổ ỗ Ỗ ố Ố ộ Ộ ơ
>>> Ơ ờ Ờ ở Ở ỡ Ỡ ớ Ớ ợ Ợ u U ù Ù ủ Ủ ũ Ũ ú Ú ụ Ụ ư Ư ừ Ừ ử Ử ữ Ữ ứ Ứ ự
>>> Ự y Y ỳ Ỳ ỷ Ỷ ỹ Ỹ ý Ý ỵ Ỵ
>>
>> It is much simpler to describe the Vietnamese alphabet as:
>> * the basic Latin letters (without any diacritic)
>> * only the following 6 extended vowels: ă â ê ô ơ ư (in addition to
>> the 6 basic vowels: a e i o u y)
>
> Indeed, but people can check that the above letters got encoded with
> single Unicode characters.
Conceptually, Philippe is correct. In the Vietnamese language, there
are 12 base vowels, each of which can carry one of five tone marks
(grave, hook above, tilde, acute, dot below) or none.
From an encoding standpoint, Samuel is correct about this example. Each
letter is in fact encoded as a single precomposed character.
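
For anyone who wants to check both views at once, here is a minimal sketch in Python. It assumes the standard unicodedata module and NFC normalization; the names BASE_VOWELS, TONE_MARKS and compose_vowels are only illustrative and do not come from the thread. It builds each letter the "conceptual" way, as a base vowel plus an optional combining tone mark, then asks whether NFC folds the sequence into a single code point.

```python
import unicodedata

# The 12 base vowels: 6 basic plus the 6 extended ones listed above.
BASE_VOWELS = "aăâeêioôơuưy"

# No tone mark, then combining grave, hook above, tilde, acute, dot below.
TONE_MARKS = ["", "\u0300", "\u0309", "\u0303", "\u0301", "\u0323"]


def compose_vowels():
    """Return every base vowel (both cases) with every tone mark, in NFC."""
    result = []
    for vowel in BASE_VOWELS:
        for cased in (vowel, vowel.upper()):
            for mark in TONE_MARKS:
                # NFC composes base + combining mark where a precomposed
                # character exists; otherwise the sequence stays longer.
                result.append(unicodedata.normalize("NFC", cased + mark))
    return result


letters = compose_vowels()

# Any entry longer than one code point has no single precomposed form.
not_single = [s for s in letters if len(s) != 1]

print(len(letters), "vowel forms generated")
print("forms without a precomposed character:", not_single)
```

If the second list comes back empty, every vowel-plus-tone combination has a precomposed character, which is Samuel's point. The consonant đ/Đ is not covered by this loop because it is a letter in its own right and not a base letter plus a combining mark.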
There is sometimes a gap between the way people view their orthography
and the way it is encoded, and it's important to make sure we who build
world-aware software can bridge that gap.
-- Doug Ewell Fullerton, California, USA http://users.adelphia.net/~dewell/