Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8 from Karl Williamson via Unicode on 2017-05-23 (Unicode Mail List Archive)

From: Karl Williamson via Unicode <unicode_at_unicode.org>
Date: Tue, 23 May 2017 14:57:24 -0600

On 05/23/2017 12:20 PM, Asmus Freytag (c) via Unicode wrote:
> On 5/23/2017 10:45 AM, Markus Scherer wrote:
>> On Tue, May 23, 2017 at 7:05 AM, Asmus Freytag via Unicode
>> <unicode_at_unicode.org> wrote:
>>
>> So, if the proposal for Unicode really was more of a "feels right"
>> and not a "deviate at your peril" situation (or necessary escape
>> hatch), then we are better off not making a RECOMMENDATION that
>> goes against collective practice.
>>
>>
>> I think the standard is quite clear about this:
>>
>> Although a UTF-8 conversion process is required to never consume
>> well-formed subsequences as part of its error handling for
>> ill-formed subsequences, such a process is not otherwise
>> constrained in how it deals with any ill-formed subsequence
>> itself. An ill-formed subsequence consisting of more than one code
>> unit could be treated as a single error or as multiple errors.
>>
>>
> And why add a recommendation that changes that from completely up to the
> implementation (or groups of implementations) to something where one way
> of doing it now has to justify itself?
>
> If the thread has made one thing clear, it is that there's no consensus in
> the wider community that one approach is obviously better. When it comes
> to ill-formed sequences, all bets are off. Simple as that.
>
> Adding a "recommendation" this late in the game is just bad standards
> policy.
>
> A./
>
>

Unless I misunderstand, you are missing the point. There is already a
recommendation listed in TUS, and that recommendation appears to have
been added without much thought. There is no proposal to add a
recommendation "this late in the game".
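
For readers who want to see what "a single error or as multiple errors"
means in practice, here is a minimal sketch. It is not taken from the
thread; it only counts how many U+FFFD characters one particular decoder
emits for two ill-formed inputs. It assumes CPython 3.8 or later and its
built-in "utf-8" codec with errors="replace"; the helper name
count_replacements and the sample byte strings are made up for
illustration. Other decoders may legitimately give different counts,
which is exactly what the quoted passage permits.

```python
# Illustrative only: counts U+FFFD produced by CPython's built-in decoder.
# The helper name and the sample inputs are assumptions for this sketch.
REPLACEMENT = "\ufffd"


def count_replacements(data: bytes) -> int:
    """Decode data as UTF-8, replacing errors, and count U+FFFD."""
    return data.decode("utf-8", errors="replace").count(REPLACEMENT)


samples = {
    # Valid prefix of a 4-byte sequence, cut off before the last byte.
    "truncated 4-byte sequence": b"\xf0\x90\x80",
    # E0 must be followed by A0..BF, so 80 cannot continue it.
    "E0 followed by 80 80": b"\xe0\x80\x80",
}

for label, raw in samples.items():
    print(f"{label}: {raw.hex(' ')} -> {count_replacements(raw)} U+FFFD")
```

On the CPython versions I am aware of, the first input is reported as one
error and the second as three, which matches the practice currently
described in TUS. A decoder that treated E0 80 80 as a single error would
be equally conforming under the text Markus quotes.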
Received on Tue May 23 2017 - 15:57:45 CDT
