Re: What to backup after corruption of code units?

From: Eli Zaretskii <eliz_at_gnu.org>
Date: Wed, 28 Aug 2013 07:21:33 +0300

> Date: Wed, 28 Aug 2013 09:36:53 +0800
> From: Xue Fuqiao <xfq.free_at_gmail.com>
>
> For example, when randomly accessing a string, a program can find the
> boundary of a character with limited backup. In UTF-16, if a pointer
> points to a leading surrogate, a single backup is required. In UTF-8,
> if a pointer points to a byte starting with 10xxxxxx (in binary), one
> to three backups are required to find the beginning of the character.
>
> What does the "backup" mean here? What does the program backup?
>
> I searched "backup" with unicode.org/search/ but didn't get anything
> that looked promising. Can anyone point me in the right direction?

It means the program needs to go back (a.k.a. "back up") one to three
bytes, in the UTF-8 case, to find the leading byte of the multibyte
character sequence.
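
To make that concrete, here is a minimal sketch of the UTF-8 case in Python. The function name utf8_char_start is my own invention for illustration, not something from the Unicode Standard or any library, and the code assumes the buffer holds well-formed UTF-8; it does no validation.

```python
def utf8_char_start(buf: bytes, i: int) -> int:
    """Return the index of the lead byte of the character containing buf[i].

    Hypothetical helper for illustration; assumes buf is well-formed UTF-8.
    """
    steps = 0
    # Continuation bytes match 10xxxxxx, i.e. (byte & 0xC0) == 0x80.
    # A well-formed sequence has at most three of them, so at most
    # three backward steps are ever needed.
    while i > 0 and steps < 3 and (buf[i] & 0xC0) == 0x80:
        i -= 1
        steps += 1
    return i


if __name__ == "__main__":
    data = "a€😀".encode("utf-8")  # 1-byte, 3-byte and 4-byte characters
    for pos in range(len(data)):
        print(pos, "->", utf8_char_start(data, pos))
```

The loop stops as soon as it sees a byte that is not of the form 10xxxxxx, which is either an ASCII byte or the lead byte of a multibyte sequence. The three-step cap is what "limited backup" refers to: the search never has to scan further back than that.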

Received on Tue Aug 27 2013 - 23:23:37 CDT