Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

From: Alastair Houghton via Unicode <>
Date: Tue, 16 May 2017 08:26:33 +0100

On 15 May 2017, at 23:43, Richard Wordingham via Unicode <> wrote:
> The problem with surrogates is inadequate testing. They're sufficiently
> rare for many users that it may be a long time before an error is
> discovered. It's not always obvious that code is designed for UCS-2
> rather than UTF-16.

While I don’t think we should spend too long debating the relative merits of UTF-8 versus UTF-16, I’ll note that the same argument applies equally to combining characters and, indeed, to the underlying UTF-8 encoding itself, and that mistakes in handling both are not exactly uncommon. UTF-8 and UTF-16 each have their advantages.
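
To make the point concrete, here is a minimal Python sketch. The sample strings and the helper names (utf16_code_units, utf8_code_units, naive_truncate_utf16) are hypothetical and chosen purely for illustration. It shows that code which assumes one code unit per character is wrong for UTF-16 surrogate pairs, for multi-byte UTF-8 sequences, and for combining sequences alike, and that tests using only ASCII input would not reveal any of these.

```python
# Illustrative sketch only; sample strings and helper names are assumptions.

def utf16_code_units(text: str) -> int:
    """Count the 16-bit code units in the UTF-16 encoding of text."""
    return len(text.encode("utf-16-le")) // 2


def utf8_code_units(text: str) -> int:
    """Count the 8-bit code units (bytes) in the UTF-8 encoding of text."""
    return len(text.encode("utf-8"))


def naive_truncate_utf16(text: str, units: int) -> str:
    """Truncate at a fixed number of UTF-16 code units, as UCS-2-minded
    code might. A cut that lands inside a surrogate pair leaves ill-formed
    data, which the decoder is asked to replace rather than reject."""
    raw = text.encode("utf-16-le")[: 2 * units]
    return raw.decode("utf-16-le", errors="replace")


samples = {
    "ascii": "e",                   # one unit in every encoding
    "combining": "e\u0301",         # e + COMBINING ACUTE ACCENT
    "supplementary": "\U0001F600",  # outside the BMP: surrogate pair in UTF-16
}

for name, text in samples.items():
    print(name, len(text), utf16_code_units(text), utf8_code_units(text))

# Cutting after two UTF-16 code units splits the surrogate pair.
damaged = naive_truncate_utf16("a\U0001F600", 2)
print(repr(damaged))
```

Only the ASCII sample gives the same count under every measure; the other two are each miscounted by some one-unit-per-character assumption, whichever encoding is in use.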

Kind regards,


Received on Tue May 16 2017 - 02:26:48 CDT
