Re: What does it mean to "not be a valid string in Unicode"?

From: Markus Scherer <markus.icu_at_gmail.com>
Date: Mon, 7 Jan 2013 10:27:40 -0800

Unicode libraries commonly provide functions that take a code point and
return a value, for example a property value. Such a function normally
accepts the whole range 0..10ffff (and may even return a default value for
out-of-range inputs).
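
As a rough illustration, a lookup of this kind could be sketched in Python as follows. The function name general_category and the choice of "Cn" (unassigned) as the default are assumptions made for this example, not any particular library's API; unicodedata.category is the standard-library call doing the real work.

```python
import unicodedata

def general_category(cp: int, default: str = "Cn") -> str:
    """Return the General_Category of any integer treated as a code point.

    Hypothetical helper: accepts the whole range 0..0x10FFFF, surrogate
    code points included, and returns `default` for out-of-range input
    instead of raising an error.
    """
    if not 0 <= cp <= 0x10FFFF:
        return default  # assumed default for values outside the code point range
    return unicodedata.category(chr(cp))
```

Note that a surrogate code point such as 0xD800 is a perfectly good input here; it simply has the property value "Cs".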

Also, we commonly read code points from 16-bit Unicode strings, and
unpaired surrogates are returned as themselves and treated as surrogate
code points (e.g., in collation). A string containing them is not
well-formed UTF-16, but that is generally harmless in text processing.
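
A minimal sketch of that kind of lenient reading, in Python, with the 16-bit string modeled as a sequence of integer code units. The function name code_points and this representation are assumptions for illustration only; real libraries do this with their own iterators or macros.

```python
from typing import Iterator, Sequence

def code_points(units: Sequence[int]) -> Iterator[int]:
    """Yield code points from 16-bit code units (hypothetical helper).

    A lead surrogate followed by a trail surrogate is combined into one
    supplementary code point. Any unpaired surrogate is yielded as itself
    rather than being rejected or replaced.
    """
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        i += 1
        # Combine only when a lead surrogate is directly followed by a trail.
        if 0xD800 <= u <= 0xDBFF and i < n and 0xDC00 <= units[i] <= 0xDFFF:
            u = 0x10000 + ((u - 0xD800) << 10) + (units[i] - 0xDC00)
            i += 1
        yield u
```

For example, the code units 0061 D83D DE00 D800 0062 come out as the code points 0061, 1F600, D800, 0062: the pair is combined and the lone D800 passes through unchanged.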

markus
Received on Mon Jan 07 2013 - 12:32:34 CST
