Mon Feb 24 2003

    Frank Tang asked:

    > I am working on updating the Mozilla UTF-8 code to incorporate the changes
    > to the UTF-8 definition in Unicode 3.1 (make non-shortest forms illegal,
    > and make 5- and 6-octet sequences illegal) and Unicode 3.2 (make irregular
    > forms illegal) now. I wonder, is there any change to the UTF-8 definition
    > from Unicode 3.2 to Unicode 4.0? If there is, I would like to know that earlier.

    And the answer is no: there is no further change in the definition
    of UTF-8 from Unicode 3.2 to Unicode 4.0.

    There is considerable change to the text of the normative
    part of the standard, to systematically incorporate the changes
    to UTF-8, and to put UTF-8, UTF-16, and UTF-32 on an equal
    footing in the text, but there is no further substantive
    change in the definition of UTF-8 past the changes documented
    in UAX #28 for Unicode 3.2.

    In particular, all the legal UTF-8 byte sequences documented
    in Table 3.1B in Unicode 3.2 are incorporated exactly the same
    way in the corresponding table in Unicode 4.0.
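
    For anyone updating a decoder along these lines, here is a minimal
    sketch of a table-driven check for legal UTF-8 byte sequences. The
    byte ranges are my transcription of the well-formed sequence table
    and should be verified against the published standard before use.
    The names is_well_formed_utf8 and _RANGES are hypothetical and are
    not taken from the Mozilla code or from the standard itself.

```python
# Illustrative sketch only; names are hypothetical, and the ranges below are
# transcribed by hand, so verify them against the standard's table.

# Each entry: (lead byte low, lead byte high, [(low, high) for each trailing byte]).
_RANGES = [
    (0x00, 0x7F, []),                                          # U+0000..U+007F
    (0xC2, 0xDF, [(0x80, 0xBF)]),                              # U+0080..U+07FF
    (0xE0, 0xE0, [(0xA0, 0xBF), (0x80, 0xBF)]),                # U+0800..U+0FFF
    (0xE1, 0xEC, [(0x80, 0xBF), (0x80, 0xBF)]),                # U+1000..U+CFFF
    (0xED, 0xED, [(0x80, 0x9F), (0x80, 0xBF)]),                # U+D000..U+D7FF
    (0xEE, 0xEF, [(0x80, 0xBF), (0x80, 0xBF)]),                # U+E000..U+FFFF
    (0xF0, 0xF0, [(0x90, 0xBF), (0x80, 0xBF), (0x80, 0xBF)]),  # U+10000..U+3FFFF
    (0xF1, 0xF3, [(0x80, 0xBF), (0x80, 0xBF), (0x80, 0xBF)]),  # U+40000..U+FFFFF
    (0xF4, 0xF4, [(0x80, 0x8F), (0x80, 0xBF), (0x80, 0xBF)]),  # U+100000..U+10FFFF
]


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if every byte sequence in data matches a row of _RANGES."""
    i = 0
    n = len(data)
    while i < n:
        lead = data[i]
        for low, high, trail in _RANGES:
            if low <= lead <= high:
                break
        else:
            # Lead byte is C0, C1, F5..FF, or a stray continuation byte 80..BF.
            return False
        if i + 1 + len(trail) > n:
            return False  # sequence is truncated at the end of the input
        for offset, (t_low, t_high) in enumerate(trail, start=1):
            if not t_low <= data[i + offset] <= t_high:
                return False
        i += 1 + len(trail)
    return True
```

    With these ranges, non-shortest forms (such as C0 80 or E0 80 80),
    5- and 6-octet sequences (lead bytes F8 and above), and encoded
    surrogates (ED A0..BF xx) are all rejected because no row of the
    table matches them; no separate special-case checks are needed.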


