RE: Unicode conformant character encodings and us-ascii

From: Addison Phillips [wM] (aphillips@webmethods.com)
Date: Fri May 16 2003 - 14:49:16 EDT

    UTF-7, BOCU, SCSU, various ACEs and the rest are all "Transfer Encoding
    Syntaxes" (TES), according to the definition in UTR#17 (Character Encoding
    Model). In fact, that's a good UTR to look at for all this terminology: it
    covers the various character encoding forms (3), character encoding schemes
    (7), and what constitutes a transfer encoding syntax.
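
    To make the counts concrete, here is a minimal sketch in Python that
    serializes one short string in each of the seven encoding schemes. The
    names SCHEMES and serialize are illustrative only, and the mapping of
    each scheme to a Python codec name is an assumption about Python's
    standard codecs, not something taken from UTR#17. UTF-7 is shown
    separately for contrast.

```python
# Illustrative sketch: the seven Unicode character encoding schemes,
# identified here by Python's standard codec names (an assumption of
# this example, not terminology from UTR#17).
SCHEMES = [
    "utf-8",
    "utf-16", "utf-16-be", "utf-16-le",
    "utf-32", "utf-32-be", "utf-32-le",
]


def serialize(text):
    """Return a dict mapping each scheme name to the encoded bytes."""
    return {name: text.encode(name) for name in SCHEMES}


# U+0041 LATIN CAPITAL LETTER A followed by U+20AC EURO SIGN.
sample = "A\u20ac"
encoded = serialize(sample)

# UTF-7 for contrast: its output stays within 7-bit ASCII.
utf7 = sample.encode("utf-7")

for name, data in encoded.items():
    print(f"{name:10} {data.hex(' ')}")
print(f"{'utf-7':10} {utf7!r}")
```

    The unmarked "utf-16" and "utf-32" codecs prepend a byte order mark and
    use the platform's byte order, so their output can differ between
    machines; the -be and -le variants are fixed.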

    Best Regards,

    Addison

    Addison P. Phillips
    Director, Globalization Architecture
    webMethods, Inc.

    +1 408.962.5487 (phone) +1 408.210.3569 (mobile)
    -------------------------------------------------
    Internationalization is an architecture.
    It is not a feature.

    Chair, W3C-I18N-WG Web Services Task Force
    To participate see http://www.w3.org/International/ws

    > -----Original Message-----
    > From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On
    > Behalf Of Peter_Constable@sil.org
    > Sent: Friday, May 16, 2003 1:43 PM
    > To: unicode@unicode.org
    > Subject: Re: Unicode conformant character encodings and us-ascii
    >
    >
    >
    > Philippe Verdy wrote on 05/15/2003 11:08:19 AM:
    >
    >
    > > Don't forget other Unicode encoding forms: UTF-7, BOCU and SCSU...
    >
    > These might be considered encoding forms, and they might be able to encode
    > the Unicode coded character set, but I don't think these should be called
    > "Unicode encoding forms". There are exactly three Unicode encoding forms:
    > UTF-8, UTF-16 and UTF-32.
    >
    >
    > > Unicode only defines codepoints, not their serialization into code
    > > units and not technical aspects such as byte order
    >
    > Not true. Issues of code units, byte order and serialization are not
    > relevant in relation to the Unicode coded character set, but the Unicode
    > Standard does include specifications for three encoding forms and seven
    > encoding schemes in which these things are most definitely defined.
    >
    >
    >
    > - Peter
    >
    >
    > ---------------------------------------------------------------------------
    > Peter Constable
    >
    > Non-Roman Script Initiative, SIL International
    > 7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
    > Tel: +1 972 708 7485
    >
    >
    >


