Re: Vietnamese (Re: Unicode, SMS, PDA/cellphones)

From: Doug Ewell (dewell@adelphia.net)
Date: Sat Jun 03 2006 - 15:47:33 CDT


    Samuel Thibault <samuel dot thibault at ens dash lyon dot org> wrote:

    >>> IIRC the only needed letters (in addition to ASCII) are as follows:
    >>>
    >>> a A à À ả Ả ã Ã á Á ạ Ạ ă Ă ằ Ằ ẳ Ẳ ẵ Ẵ ắ Ắ ặ Ặ â Â ầ Ầ ẩ Ẩ ẫ Ẫ ấ Ấ
    >>> ậ Ậ d D đ Đ Đ e E è È ẻ Ẻ ẽ Ẽ é É ẹ Ẹ ê Ê ề Ề ể Ể ễ Ễ ế Ế ệ Ệ i I ì
    >>> Ì ỉ Ỉ ĩ Ĩ í Í ị Ị o O ò Ò ỏ Ỏ õ Õ ó Ó ọ Ọ ô Ô ồ Ồ ổ Ổ ỗ Ỗ ố Ố ộ Ộ ơ
    >>> Ơ ờ Ờ ở Ở ỡ Ỡ ớ Ớ ợ Ợ u U ù Ù ủ Ủ ũ Ũ ú Ú ụ Ụ ư Ư ừ Ừ ử Ử ữ Ữ ứ Ứ ự
    >>> Ự y Y ỳ Ỳ ỷ Ỷ ỹ Ỹ ý Ý ỵ Ỵ
    >>
    >> It is much simpler to describe the Vietnamese alphabet as:
    >> * the basic Latin letters (without any diacritic)
    >> * only the following 6 extended vowels: ă â ê ô ơ ư (in addition to
    >> the 6 basic vowels: a e i o u y)
    >
    > Indeed, but people can check that the above letters got encoded with
    > single Unicode characters.

    Conceptually, Philippe is correct. In the Vietnamese language, there
    are 12 base vowels, to each of which one of the five tone marks (or
    none) can be applied.
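
    As a rough illustration of this conceptual view, the sketch below
    builds every lowercase vowel form from a base vowel plus an optional
    tone mark. It assumes Python 3 and its standard unicodedata module;
    the names BASE_VOWELS, TONE_MARKS and toned_vowels are made up for
    this example and do not come from the thread.

```python
import unicodedata

# Assumption: the 12 base vowels named in the thread (6 basic + 6 extended).
BASE_VOWELS = "aeiouy\u0103\u00e2\u00ea\u00f4\u01a1\u01b0"  # a e i o u y ă â ê ô ơ ư

# Assumption: the tone marks expressed as Unicode combining characters.
TONE_MARKS = {
    "grave": "\u0300",
    "hook above": "\u0309",
    "tilde": "\u0303",
    "acute": "\u0301",
    "dot below": "\u0323",
}

def toned_vowels(base_vowels=BASE_VOWELS):
    """Return each vowel with no tone mark, then with each single tone mark."""
    forms = []
    for vowel in base_vowels:
        forms.append(vowel)
        for mark in TONE_MARKS.values():
            # NFC composes the base vowel and the mark where Unicode allows it.
            forms.append(unicodedata.normalize("NFC", vowel + mark))
    return forms

letters = toned_vowels()
print(len(letters))  # 12 vowels x (5 marks + none) = 72
```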

    From an encoding standpoint, Samuel is correct about this example. Each
    letter is in fact encoded as a single precomposed character.
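
    One way to check this claim for a given letter is to normalize it to
    NFC and count the code points. The sketch below assumes Python 3 and
    the standard unicodedata module; the helper names and the sample
    letters are chosen only for illustration.

```python
import unicodedata

def is_single_code_point(letter):
    """True if the letter is one code point after NFC normalization."""
    return len(unicodedata.normalize("NFC", letter)) == 1

def code_points(text):
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]

# A few sample letters from the quoted list, written as escapes so the
# example does not depend on how this file itself is encoded.
samples = [
    "\u1eb1",  # a with breve and grave
    "\u1ec7",  # e with circumflex and dot below
    "\u1ee3",  # o with horn and dot below
    "\u1ef9",  # y with tilde
    "\u0111",  # d with stroke
]

for letter in samples:
    nfd = unicodedata.normalize("NFD", letter)
    print(letter, is_single_code_point(letter), code_points(nfd))
```

    The NFD column shows the decomposed sequence that the same letter
    could also be written as. A letter such as d with stroke has no
    decomposition at all and stays a single code point either way.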

    There is sometimes a gap between the way people view their orthography
    and the way it is encoded, and it's important to make sure we who build
    world-aware software can bridge that gap.
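
    A minimal sketch of one such bridge, assuming Python 3 and the
    standard unicodedata module: it takes a precomposed letter and
    recovers the base letter and tone mark a reader would perceive. The
    function split_tone and the TONE_NAMES table are hypothetical names
    for this example, and the code assumes its input is a single
    Vietnamese letter.

```python
import unicodedata

# Assumption: combining characters treated as Vietnamese tone marks.
TONE_NAMES = {
    "\u0300": "grave",
    "\u0309": "hook above",
    "\u0303": "tilde",
    "\u0301": "acute",
    "\u0323": "dot below",
}

def split_tone(letter):
    """Split one letter into (base letter, tone name or None)."""
    tone = None
    kept = []
    # NFD exposes the base letter and every combining mark separately.
    for ch in unicodedata.normalize("NFD", letter):
        if ch in TONE_NAMES:
            tone = TONE_NAMES[ch]
        else:
            kept.append(ch)
    # Recompose what is left, so a circumflex, breve or horn stays
    # attached to the base vowel.
    return unicodedata.normalize("NFC", "".join(kept)), tone

print(split_tone("\u1ec7"))  # ('ê', 'dot below')
```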

    --
    Doug Ewell
    Fullerton, California, USA
    http://users.adelphia.net/~dewell/
    

