Re: 32'nd bit & UTF-8

From: Philippe Verdy
Date: Mon Jan 17 2005 - 20:31:22 CST

    From: "Hans Aberg" <>
    >>The old RFC you're referring to is not designating UTF-8, but UTF-BSS,
    >>which is
    >>a transformation format,
    > OK. Fine, so we have a name for it.

    I was not sure about the name of it when writing the message.

    In fact we were speaking here about RFC2044 (published in 1996), which
    was only informational and not a standard, then obsoleted in 1998 by the
    *draft proposed* standard RFC2279, itself obsoleted by RFC3629 (standard
    number STD0063, conforming to Unicode and ISO/IEC 10646, and approved in

    RFC2044 made an informative bibliographic reference to:

    - [FSS_UTF] X/Open CAE Specification C501 ISBN 1-85912-082-2 28cm.
                      22p. pbk. 172g. 4/95, X/Open Company Ltd., "File Sys-
                      tem Safe UCS Transformation Format (FSS_UTF)", X/Open
                      Preleminary Specification, Document Number P316. Also
                      published in Unicode Technical Report #4.
    which itself referred to UTR#4, which was still not a standard itself,

    - and to the 1993 version of ISO/IEC 10646-1

    (however ISO/IEC 10646-1:1993 made no standard reference to X/Open's
    FSS_UTF, which was only "described" in an adopted, but unpublished annex!).
    Formally, there was no standard UTF in either Unicode or in ISO/IEC 10646-1.

    So this was only documentation of an existing implementation, needed
    then to allow encapsulation of Unicode *or* ISO/IEC 10646 within MIME (at
    that time the two standards were not joined and had mutual
    incompatibilities: in their encoding, in the supported repertoire and in the
    character model; this was only Unicode 1.0, and even at that time Unicode
    had not assigned any codepoint out of the BMP, so at least there, there was
    no internal Unicode compatibility issue about the change in the definition of

    The problem becomes more serious with RFC2279, which was published with
    Unicode 2.0; however, no characters had been adopted out of the BMP.

    So if you want a normative RFC for UTF-8, refer only to RFC3629, or, as
    suggested in that RFC, read the conformance section of the Unicode standard.
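
    To make the difference between the older and the current definitions
    concrete, here is a rough sketch in Python. It assumes the octet-length
    table of the older RFC2279 / FSS_UTF layout (up to six octets, covering
    31 bits) and the restriction to U+0000..U+10FFFF, surrogates excluded,
    in the current standard RFC. The function names are made up for this
    illustration and do not come from either RFC.

```python
# Illustrative sketch only; function names are hypothetical.

# Upper bound of the code point range for each octet count (1..6)
# in the older, 31-bit layout (RFC2279 / FSS_UTF).
_OLD_LIMITS = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]


def old_utf8_length(code_point: int) -> int:
    """Return the number of octets the older 31-bit layout would use."""
    if code_point < 0 or code_point > 0x7FFFFFFF:
        raise ValueError("outside the 31-bit range")
    for octets, limit in enumerate(_OLD_LIMITS, start=1):
        if code_point <= limit:
            return octets
    raise AssertionError("unreachable")


def allowed_in_current_utf8(code_point: int) -> bool:
    """True if the current standard definition permits encoding this value."""
    in_range = 0 <= code_point <= 0x10FFFF
    is_surrogate = 0xD800 <= code_point <= 0xDFFF
    return in_range and not is_surrogate


if __name__ == "__main__":
    for cp in (0x41, 0xFFFF, 0x10FFFF, 0x110000, 0x7FFFFFFF):
        print(hex(cp), old_utf8_length(cp), allowed_in_current_utf8(cp))
```

    For every value inside the BMP the two definitions agree on the octet
    count, which is why the change caused no practical trouble while nothing
    was assigned outside the BMP.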
