What to backup after corruption of code units? from Xue Fuqiao on 2013-08-27 (Unicode Mail List Archive)

From: Xue Fuqiao <xfq.free_at_gmail.com>
Date: Wed, 28 Aug 2013 09:36:53 +0800

Hi list,

I'm reading Unicode 6.2.0 and have a question. In Section 2.5, Encoding Forms:

  For example, when randomly accessing a string, a program can find the
  boundary of a character with limited backup. In UTF-16, if a pointer
  points to a leading surrogate, a single backup is required. In UTF-8,
  if a pointer points to a byte starting with 10xxxxxx (in binary), one
  to three backups are required to find the beginning of the character.
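
My tentative reading, which may well be wrong, is that a "backup" is one
step of moving the pointer backward by a single code unit. Here is a small
Python sketch of that guess. The function names are made up for
illustration. I am also assuming that in UTF-16 the step back happens when
the pointer lands on the second (trailing) unit of a surrogate pair.

```python
from typing import Sequence


def utf8_char_start(data: bytes, i: int) -> int:
    """Move index i backward to the first byte of the character containing it.

    Assumes data is well-formed UTF-8, so at most three steps are needed.
    """
    steps = 0
    # A continuation byte has the bit pattern 10xxxxxx.
    while i > 0 and (data[i] & 0xC0) == 0x80 and steps < 3:
        i -= 1  # one "backup" under my reading
        steps += 1
    return i


def utf16_char_start(units: Sequence[int], i: int) -> int:
    """Move index i backward by at most one 16-bit code unit."""
    on_trailing = 0xDC00 <= units[i] <= 0xDFFF
    after_leading = i > 0 and 0xD800 <= units[i - 1] <= 0xDBFF
    if on_trailing and after_leading:
        return i - 1  # a single "backup"
    return i


if __name__ == "__main__":
    data = "a\u20ac\U0001F600".encode("utf-8")
    print(utf8_char_start(data, 3))  # index of the first byte of U+20AC
    print(utf16_char_start([0x61, 0xD83D, 0xDE00], 2))
```

If this reading is right, nothing is being copied or saved; only the
position in the string changes.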

What does "backup" mean here? What does the program back up?

I searched for "backup" at unicode.org/search/ but didn't find anything
that looked promising. Can anyone point me in the right direction?

(English is not my native language; please excuse typing errors.)

-- 
Best regards, Xue Fuqiao.

Received on Tue Aug 27 2013 - 22:53:57 CDT