Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

From: Richard Wordingham via Unicode <unicode_at_unicode.org>
Date: Mon, 15 May 2017 23:43:29 +0100

On Mon, 15 May 2017 21:38:26 +0000
David Starner via Unicode <unicode_at_unicode.org> wrote:

> > and the fact is that handling surrogates (which is what proponents
> > of UTF-8 or UCS-4 usually focus on) is no more complicated than
> > handling combining characters, which you have to do anyway.

> Not necessarily; you can legally process Unicode text without worrying
> about combining characters, whereas you cannot process UTF-16 without
> handling surrogates.

The problem with surrogates is inadequate testing. They're sufficiently
rare for many users that it may be a long time before an error is
discovered. It's not always obvious that code is designed for UCS-2
rather than UTF-16.
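
To illustrate the testing gap, here is a minimal sketch in Python. The
helper names (utf16_units, reverse_ucs2, reverse_utf16, is_well_formed)
are hypothetical and come from no particular library. It reverses a
string code unit by code unit, which is correct under a UCS-2
assumption, and compares that with a surrogate-aware reversal. On
BMP-only test data the two agree, so the bug stays hidden.

```python
import struct

def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

def is_high(u):
    return 0xD800 <= u <= 0xDBFF

def is_low(u):
    return 0xDC00 <= u <= 0xDFFF

def reverse_ucs2(units):
    """Reverse one code unit at a time (assumes UCS-2, no surrogates)."""
    return units[::-1]

def reverse_utf16(units):
    """Reverse by code point, keeping each surrogate pair in order."""
    out = []
    i = len(units)
    while i > 0:
        i -= 1
        if is_low(units[i]) and i > 0 and is_high(units[i - 1]):
            # Emit the pair as high, low so it stays well-formed.
            out.extend([units[i - 1], units[i]])
            i -= 1
        else:
            out.append(units[i])
    return out

def is_well_formed(units):
    """True if every surrogate belongs to a high+low pair."""
    i = 0
    while i < len(units):
        if is_high(units[i]):
            if i + 1 >= len(units) or not is_low(units[i + 1]):
                return False
            i += 2
        elif is_low(units[i]):
            return False
        else:
            i += 1
    return True

bmp = utf16_units("abc\u00e9")        # BMP only: typical test data
supp = utf16_units("a\U0001F600b")    # contains one supplementary character

print(reverse_ucs2(bmp) == reverse_utf16(bmp))   # both approaches agree
print(is_well_formed(reverse_ucs2(supp)))        # UCS-2 reversal breaks the pair
print(is_well_formed(reverse_utf16(supp)))       # surrogate-aware reversal is fine
```

A test suite that never includes a character outside the BMP cannot
tell the two reversal functions apart.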

Richard.
Received on Mon May 15 2017 - 17:44:03 CDT
