Re: Interesting UTF-8 decoder

From: Mark Davis ☕️ via Unicode <unicode_at_unicode.org>
Date: Mon, 9 Oct 2017 13:16:03 +0200

The post points out that, as a precondition, the input buffer needs to be
padded with 3 null bytes.
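
For anyone who wants to see what that precondition means in practice, here
is a minimal sketch in C. It assumes the decoder always loads 4 bytes from
the current position, which is why 3 trailing zero bytes are enough. The
helper name utf8_pad_copy and the UTF8_PAD constant are made up for this
illustration and are not taken from the blog post.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: the decoder reads 4 bytes at every position, so the last
 * real byte needs 3 readable bytes after it. */
#define UTF8_PAD 3

/* Hypothetical helper: copy len bytes of src into a fresh buffer and
 * append UTF8_PAD zero bytes. Returns NULL on overflow or allocation
 * failure; the caller frees the result. */
unsigned char *utf8_pad_copy(const void *src, size_t len)
{
    assert(src != NULL || len == 0);

    if (len > SIZE_MAX - UTF8_PAD)      /* len + UTF8_PAD would overflow */
        return NULL;

    unsigned char *buf = malloc(len + UTF8_PAD);
    if (buf == NULL)
        return NULL;

    if (len > 0)
        memcpy(buf, src, len);          /* the actual UTF-8 data */
    memset(buf + len, 0, UTF8_PAD);     /* the padding */
    return buf;
}
```

A caller would decode from the returned buffer and stop after len bytes.
The padding only makes the over-read safe; it does not turn a truncated
sequence at the end of the input into valid UTF-8.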

Mark <https://twitter.com/mark_e_davis>

On Mon, Oct 9, 2017 at 10:57 AM, J Decker via Unicode <unicode_at_unicode.org>
wrote:

> That's interesting; however, it will segfault if the string ends on a
> memory allocation boundary. Callers will have to make sure strings are
> always allocated with 3 extra bytes.
>
> 2017-10-09 1:37 GMT-07:00 Martin J. Dürst via Unicode <unicode_at_unicode.org>:
>
>> A friend of mine sent me a pointer to
>> http://nullprogram.com/blog/2017/10/06/, a branchless UTF-8 decoder.
>>
>> Regards, Martin.
>>
>
>
Received on Mon Oct 09 2017 - 06:16:50 CDT
