Re: Corrigendum #1 (UTF-8 shortest form) wording: MIME, and software interfaces specifications

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Nov 07 2003 - 19:32:59 EST

Next message: Kent Karlsson: "RE: Corrigendum #1 (UTF-8 shortest form) wording: MIME, and software interfaces specifications"

Previous message: John Cowan: "Handy table of combining character classes"
In reply to: Doug Ewell: "Re: Corrigendum #1 (UTF-8 shortest form) wording: MIME, and software interfaces specifications"
Next in thread: Kent Karlsson: "RE: Corrigendum #1 (UTF-8 shortest form) wording: MIME, and software interfaces specifications"
Reply: Kent Karlsson: "RE: Corrigendum #1 (UTF-8 shortest form) wording: MIME, and software interfaces specifications"

From: "Doug Ewell" <dewell@adelphia.net>

> Philippe Verdy wrote (in rich text):
>
> > Due to that, an application needs to specify whether it will support
> > and comply with the full ISO/IEC 10646-1:2000 character set or with the
> > Unicode subset.
>
> ISO/IEC 10646 has reduced its range to match Unicode's, so this
> distinction is obsolete.

It is not obsolete: Corrigendum #1 for UTF-8 (published in Unicode 4.0)
refers to ISO/IEC 10646-1:2000, not to ISO/IEC 10646:2003, whose character
repertoire is the one that corresponds to Unicode 4.0...

So that's a reference error in the version of the now normative corrigendum
published in Unicode 4.0...

Does it need another Corrigendum to correct this reference in the
Corrigendum?

Well, I still doubt that ISO/IEC 10646 has reduced its character set. It has
just agreed to limit its repertoire of _standardized_ and _interchangeable_
characters to the first 17 planes, so that _these_ characters can remain in
sync with the Unicode repertoire and be encoded there with the same code
points. All the other planes are still present in ISO/IEC 10646, and some of
them are still allocated to PUAs that have no equivalent in Unicode. Code
points in those planes are still valid within UTF-8 encoded data and still
conform to ISO/IEC 10646. Even if they are illegal for use in Unicode 4.0,
these sequences are not ill-formed in the way that non-shortest forms, now
forbidden in both standards, are.
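
To make the two cases concrete, here is a minimal sketch in Python. It
assumes the original 1-to-6-byte UTF-8 scheme covering the 31-bit UCS-4
range, and it only illustrates the distinction drawn above; it is not a
statement of what either standard permits. The helper name classify() and
its return labels are hypothetical, and surrogate code points are ignored
for brevity.

```python
# Smallest code point that legitimately needs each sequence length
# in the original 1..6-byte UTF-8 scheme (assumption: 31-bit UCS-4 range).
MIN_FOR_LENGTH = {1: 0x0, 2: 0x80, 3: 0x800, 4: 0x10000,
                  5: 0x200000, 6: 0x4000000}


def classify(seq: bytes) -> str:
    """Classify a single encoded sequence (hypothetical helper)."""
    if not seq:
        return "malformed"
    lead = seq[0]
    # Derive the sequence length and the payload bits from the lead byte.
    if lead < 0x80:
        length, value = 1, lead
    elif 0xC0 <= lead < 0xE0:
        length, value = 2, lead & 0x1F
    elif 0xE0 <= lead < 0xF0:
        length, value = 3, lead & 0x0F
    elif 0xF0 <= lead < 0xF8:
        length, value = 4, lead & 0x07
    elif 0xF8 <= lead < 0xFC:
        length, value = 5, lead & 0x03
    elif 0xFC <= lead < 0xFE:
        length, value = 6, lead & 0x01
    else:
        return "malformed"  # stray continuation byte, or 0xFE/0xFF
    if len(seq) != length:
        return "malformed"
    # Each trail byte must look like 10xxxxxx and contributes 6 bits.
    for b in seq[1:]:
        if (b & 0xC0) != 0x80:
            return "malformed"
        value = (value << 6) | (b & 0x3F)
    # Shortest-form rule: the value must really need this many bytes.
    if value < MIN_FOR_LENGTH[length]:
        return "non-shortest form"
    # Range rule: Unicode stops at the end of plane 16.
    if value > 0x10FFFF:
        return "beyond plane 16"
    return "within planes 0-16"


if __name__ == "__main__":
    for sample in (b"\x2F", b"\xC0\xAF", b"\xF4\x8F\xBF\xBF",
                   b"\xF4\x90\x80\x80", b"\xF8\x88\x80\x80\x80"):
        print(sample.hex(), classify(sample))
```

In this sketch, C0 AF (a two-byte encoding of U+002F) fails the
shortest-form check, while F4 90 80 80 decodes cleanly to 0x110000 and
fails only the range check. That is the difference between "ill-formed"
and "outside the Unicode subset" discussed above.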

