Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

From: Asmus Freytag via Unicode <>
Date: Tue, 23 May 2017 07:05:04 -0700
On 5/23/2017 1:24 AM, Martin J. Dürst via Unicode wrote:
Hello Mark,

On 2017/05/22 01:37, Mark Davis ☕️ via Unicode wrote:
I actually didn't see any of this discussion until today.

Many thanks for chiming in.

(mail was going into my spam folder...) I started
reading the thread, but it looks like a lot of it is OT,

As is quite usual on mailing lists :-(.

so just scanned
some of them.

A few brief points:

   1. There is plenty of time for public comment, since it was targeted at
   *Unicode 11*, the release for about a year from now, *not* *Unicode 10*,
   due this
   2. When the UTC "approves a change", that change is subject to comment,
   and the UTC can always reverse or modify its approval up until the meeting
   before release date. *So there are ca. 9 months in which to comment.*

This is good to hear. What's the best way to submit such comments?

   3. The modified text is a set of guidelines, not requirements. So no
   conformance clause is being changed.
   - If people really believed that the guidelines in that section should
      have been conformance clauses, they should have proposed that at
      some point.

I may have missed something, but I think nobody actually proposed to change the recommendations into requirements. I think everybody understands that there are several ways to do things, and situations where one or the other is preferred. The only advantage of changing the current recommendations to requirements would be to make it more difficult for them to be changed.

In this context it's worth looking at other standards organizations' use of "recommended", because that may explain a lot of people's unease with this. For example, the IETF has RFC 2119, which says:


1. MUST

This word, or the terms "REQUIRED" or "SHALL", mean that the definition is an absolute requirement of the specification. ...


3. SHOULD

This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course. ...

5. MAY

This word, or the adjective "OPTIONAL", mean that an item is truly optional. One vendor may choose to include the item because a particular marketplace requires it or because the vendor feels that it enhances the product while another vendor may omit the same item. An implementation which does not include a particular option MUST be prepared to interoperate with another implementation which does include the option, though perhaps with reduced functionality. In the same vein an implementation which does include a particular option MUST be prepared to interoperate with another implementation which does not include the option (except, of course, for the feature the option provides.)

Reading this, it's clear that "RECOMMENDED" is not merely a "we think this is the best way to do it" but a rather sterner "you deviate at your peril" kind of statement.

The latter is what makes it difficult for others to collectively agree on a different choice faced with a formal RECOMMENDATION.

So, if the proposal for Unicode really was more of a "feels right" and not a "deviate at your peril" situation (or necessary escape hatch), then we are better off not making a RECOMMENDATION that goes against collective practice.

I think the situation at hand is somewhat special: Recommendations are okay. But there's a strong wish from downstream communities such as Web browser implementers and programming language/library implementers to not change these recommendations. Some of these communities have stricter requirements for alignment, and some have followed longstanding recommendations in the absence of specific arguments for something different.

Regards,   Martin.

      - And they can still propose that; as I said, there is plenty of time.


On Wed, May 17, 2017 at 10:41 PM, Doug Ewell via Unicode <> wrote:

Henri Sivonen wrote:

I find it shocking that the Unicode Consortium would change a
widely-implemented part of the standard (regardless of whether Unicode
itself officially designates it as a requirement or suggestion) on
such flimsy grounds.

I'd like to register my feedback that I believe changing the best
practices is wrong.

Perhaps surprisingly, it's already too late. UTC approved this change
the day after the proposal was written.

Doug Ewell | Thornton, CO, US |

Received on Tue May 23 2017 - 09:05:10 CDT
