Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8 from Alastair Houghton via Unicode on 2017-05-23 (Unicode Mail List Archive)

From: Alastair Houghton via Unicode <unicode_at_unicode.org>
Date: Tue, 23 May 2017 19:09:33 +0100

> On 23 May 2017, at 18:45, Markus Scherer via Unicode <unicode_at_unicode.org> wrote:
>
> On Tue, May 23, 2017 at 7:05 AM, Asmus Freytag via Unicode <unicode_at_unicode.org> wrote:
>> So, if the proposal for Unicode really was more of a "feels right" and not a "deviate at your peril" situation (or necessary escape hatch), then we are better off not making a RECOMMENDATION that goes against collective practice.
>
> I think the standard is quite clear about this:
>
> Although a UTF-8 conversion process is required to never consume well-formed subsequences as part of its error handling for ill-formed subsequences, such a process is not otherwise constrained in how it deals with any ill-formed subsequence itself. An ill-formed subsequence consisting of more than one code unit could be treated as a single error or as multiple errors.

Agreed. That paragraph is entirely clear.
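
To illustrate the two treatments that paragraph allows, here is a minimal Python sketch; it is not taken from the standard. It assumes CPython 3, whose built-in "replace" handler reports the truncated sequence E2 82 as one error. The handler name "replace_each_unit" and the function behind it are made up for this example. Either way, the well-formed "A" that follows is never consumed.

```python
import codecs


def _one_per_code_unit(exc):
    # Hypothetical handler: emit one U+FFFD for every byte in the
    # ill-formed span reported by the decoder, then resume right after it.
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    return ("\ufffd" * (exc.end - exc.start), exc.end)


# The name "replace_each_unit" is an assumption, chosen for this sketch only.
codecs.register_error("replace_each_unit", _one_per_code_unit)

# A truncated three-byte sequence (E2 82) followed by a well-formed "A".
data = b"\xe2\x82A"

# Treatment 1: the ill-formed subsequence counts as a single error.
single = data.decode("utf-8", errors="replace")

# Treatment 2: the same subsequence counts as one error per code unit.
multiple = data.decode("utf-8", errors="replace_each_unit")

print(repr(single))
print(repr(multiple))
```

The two results differ only in how many U+FFFD characters precede the "A", which is exactly the latitude the quoted text grants.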

Kind regards,

Alastair.

--
http://alastairs-place.net

Received on Tue May 23 2017 - 13:09:55 CDT
