Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

From: Markus Scherer via Unicode <unicode_at_unicode.org>
Date: Tue, 23 May 2017 10:45:46 -0700

On Tue, May 23, 2017 at 7:05 AM, Asmus Freytag via Unicode <unicode_at_unicode.org> wrote:

> So, if the proposal for Unicode really was more of a "feels right" and not
> a "deviate at your peril" situation (or necessary escape hatch), then we
> are better off not making a RECOMMENDATION that goes against collective
> practice.
>

I think the standard is quite clear about this:

Although a UTF-8 conversion process is required to never consume
well-formed subsequences as part of its error handling for ill-formed
subsequences, such a process is not otherwise constrained in how it deals
with any ill-formed subsequence itself. An ill-formed subsequence
consisting of more than one code unit could be treated as a single error or
as multiple errors.
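
As a rough illustration of the two options the quoted text allows, here is a minimal Python sketch. It assumes CPython's built-in UTF-8 decoder and its standard errors="replace" handler, which emits one U+FFFD for each ill-formed span the decoder reports. The handler name "fffd_per_unit" and the two helper functions are made up for this example; they are not taken from the standard or from any existing library.

```python
import codecs

REPLACEMENT = "\ufffd"


def _fffd_per_unit(exc):
    """Hypothetical error handler: one U+FFFD per ill-formed code unit."""
    # Only decoding errors are handled in this sketch.
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    # exc.start:exc.end is the ill-formed span reported by the decoder.
    # Emit one U+FFFD per byte in that span and resume right after it,
    # so no well-formed subsequence is ever consumed.
    return REPLACEMENT * (exc.end - exc.start), exc.end


# "fffd_per_unit" is an illustrative name chosen for this example.
codecs.register_error("fffd_per_unit", _fffd_per_unit)


def decode_single_error(data: bytes) -> str:
    # Treats each reported ill-formed span as a single error.
    return data.decode("utf-8", errors="replace")


def decode_per_code_unit(data: bytes) -> str:
    # Treats each code unit of a reported ill-formed span as its own error.
    return data.decode("utf-8", errors="fffd_per_unit")


# E2 82 is a truncated three-byte sequence; the trailing "A" is well-formed.
sample = b"\xe2\x82A"
print([hex(ord(c)) for c in decode_single_error(sample)])
print([hex(ord(c)) for c in decode_per_code_unit(sample)])
```

Both functions leave the trailing "A" untouched, which is the one hard requirement in the quoted passage. They differ only in how many U+FFFD they produce for the ill-formed bytes before it.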

markus
Received on Tue May 23 2017 - 12:46:16 CDT
