Re: What does it mean to "not be a valid string in Unicode"?

From: Martin J. Dürst
Date: Tue, 08 Jan 2013 09:59:04 +0900

On 2013/01/08 3:27, Markus Scherer wrote:

> Also, we commonly read code points from 16-bit Unicode strings, and
> unpaired surrogates are returned as themselves and treated as such (e.g.,
> in collation). That would not be well-formed UTF-16, but it's generally
> harmless in text processing.

Things like this are called "garbage in, garbage out" (GIGO). It may be
harmless, or it may hurt you later.
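
To make the behavior described above concrete, here is a minimal sketch in
Python. It is an illustration only, not the code of any particular library;
the names code_points and is_well_formed_utf16 are made up for this example.
It reads code points from 16-bit code units, returns unpaired surrogates as
themselves, and then shows one place where that can hurt later (a strict
UTF-8 encoder rejects the result).

```python
def code_points(units):
    """Yield code points from a sequence of 16-bit code units.

    Lenient: an unpaired surrogate is yielded as itself instead of
    raising an error (hypothetical helper for illustration).
    """
    i, n = 0, len(units)
    while i < n:
        u = units[i]
        i += 1
        # A lead surrogate immediately followed by a trail surrogate
        # forms one supplementary code point.
        if 0xD800 <= u <= 0xDBFF and i < n and 0xDC00 <= units[i] <= 0xDFFF:
            u = 0x10000 + ((u - 0xD800) << 10) + (units[i] - 0xDC00)
            i += 1
        yield u


def is_well_formed_utf16(units):
    """True if the code units contain no unpaired surrogate."""
    # After pairing, any surrogate still present must be unpaired.
    return not any(0xD800 <= cp <= 0xDFFF for cp in code_points(units))


# 'A', a valid surrogate pair (U+1F600), a lone lead surrogate, 'B'
sample = [0x0041, 0xD83D, 0xDE00, 0xD800, 0x0042]
decoded = list(code_points(sample))  # the lone 0xD800 passes through

# "Garbage out": the lone surrogate survives until a strict consumer
# refuses it. Python's default UTF-8 encoder is such a consumer.
try:
    "".join(map(chr, decoded)).encode("utf-8")
    strict_encode_failed = False
except UnicodeEncodeError:
    strict_encode_failed = True
```

Here the lenient reader hands back U+D800 unchanged, and
is_well_formed_utf16(sample) is False. Nothing goes wrong until the text
reaches a consumer that insists on well-formed input.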

Regards, Martin.
Received on Mon Jan 07 2013 - 19:01:09 CST
