RE: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

From: Doug Ewell via Unicode <unicode_at_unicode.org>
Date: Tue, 30 May 2017 13:30:56 -0700

L2/17-168 says:

"For UTF-8, recommend evaluating maximal subsequences based on the
original structural definition of UTF-8, without ever restricting trail
bytes to less than 80..BF. For example: <C0 AF> is a single maximal
subsequence because C0 was originally a lead byte for two-byte
sequences."

When was it ever true that C0 was a valid lead byte? And what does that
have to do with (not) restricting trail bytes?
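
For illustration, here is a minimal Python sketch of the two ways of counting U+FFFD for <C0 AF>. It is one reading of the quoted recommendation, not text from L2/17-168. The helper names (structural_length, structural_replacements, builtin_replacements) are hypothetical, and the "structural" rule is assumed to mean: take the length implied by the lead byte's bit pattern and accept any trail byte in 80..BF. CPython's own decoder is used as the comparison point, since it treats C0 as never a valid lead byte.

```python
def structural_length(lead: int) -> int:
    """Length implied by the lead byte's bit pattern alone (assumed reading
    of the 'original structural definition'); 0 means 'not a lead byte'."""
    if lead < 0x80:
        return 1
    if 0xC0 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF7:
        return 4
    if 0xF8 <= lead <= 0xFB:
        return 5
    if 0xFC <= lead <= 0xFD:
        return 6
    return 0  # 80..BF (stray trail byte), FE, FF


def structural_replacements(data: bytes) -> int:
    """Count U+FFFD when each structurally delimited ill-formed
    subsequence is replaced by a single U+FFFD."""
    count = 0
    i = 0
    while i < len(data):
        n = structural_length(data[i])
        if n == 0:
            count += 1  # a byte that cannot start a sequence
            i += 1
            continue
        # Consume trail bytes, never restricting them below 80..BF.
        j = i + 1
        while j < i + n and j < len(data) and 0x80 <= data[j] <= 0xBF:
            j += 1
        chunk = data[i:j]
        try:
            chunk.decode("utf-8")  # strict: well-formed, so no replacement
        except UnicodeDecodeError:
            count += 1  # whole subsequence becomes one U+FFFD
        i = j
    return count


def builtin_replacements(data: bytes) -> int:
    """Count U+FFFD produced by CPython's decoder with errors='replace'."""
    return data.decode("utf-8", errors="replace").count("\ufffd")


sample = bytes([0xC0, 0xAF])
print(structural_replacements(sample))  # one U+FFFD under the structural reading
print(builtin_replacements(sample))     # two U+FFFD: C0 and AF replaced separately
```

Under this reading the disagreement is only about how many U+FFFD the same two bytes produce. Neither approach treats <C0 AF> as well-formed.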

--
Doug Ewell | Thornton, CO, US | ewellic.org

Received on Tue May 30 2017 - 15:31:57 CDT
