Re: Unicode FAQ addendum

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Wed Jul 19 2000 - 13:55:10 EDT


John Cowan wrote:
> The new Unicode FAQ (like the old) supplies the panting world with
> John's Own Version of Unicode Conformance:

Some of the old ones seem to be pre-Unicode 1.1. Should they not be updated?

> 1) Unicode code units are 16 bits long; deal with it.

This is true for the default encoding form, but not for UTF-8 or UTF-32. It is also misleading for UTF-16: yes, the code units are 16 bits long (or wide), but you need to know that a code point sometimes takes two of them (a surrogate pair). The statement above sounds like something from the UCS-2 days.
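
To make the code unit point concrete, here is a minimal sketch in Python. It is only an illustration, not something from the FAQ or the book; the helper names code_unit_counts and surrogate_pair are made up for this example. It counts how many code units one code point takes in each encoding form and shows how a supplementary code point is split into two 16-bit units.

```python
def code_unit_counts(ch: str) -> dict:
    """Count the code units one code point needs in each encoding form."""
    return {
        "utf-8": len(ch.encode("utf-8")),            # 8-bit code units
        "utf-16": len(ch.encode("utf-16-le")) // 2,  # 16-bit code units
        "utf-32": len(ch.encode("utf-32-le")) // 4,  # 32-bit code units
    }


def surrogate_pair(cp: int) -> tuple:
    """Split a supplementary code point (U+10000..U+10FFFF) into two 16-bit units."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)   # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits go into the low surrogate
    return high, low


for ch in ("A", "\u20ac", "\U00010400"):
    print(f"U+{ord(ch):04X}", code_unit_counts(ch))

high, low = surrogate_pair(0x10400)
print(f"U+10400 -> {high:04X} {low:04X}")
```

Only the last of the three sample characters needs two UTF-16 code units, which is the case the "16 bits; deal with it" wording hides.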

> 4) Loose surrogates don't mean jack.

This needs some explanation: they are illegal sequences, but should be passed through for interoperability (I think that is what the book says). By itself, the statement is also misleading.
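
As a rough illustration of "illegal, but passed through", here is a small Python sketch. The helper find_lone_surrogates is a hypothetical name for this example. The strict versus "surrogatepass" behaviour shown is how Python's own codecs treat unpaired surrogates, and it is not a statement of what the book requires.

```python
def find_lone_surrogates(units):
    """Return the indices of unpaired surrogates in a list of 16-bit code units."""
    lone = []
    i = 0
    while i < len(units):
        u = units[i]
        if (0xD800 <= u <= 0xDBFF and i + 1 < len(units)
                and 0xDC00 <= units[i + 1] <= 0xDFFF):
            i += 2  # well-formed pair: high surrogate followed by low surrogate
            continue
        if 0xD800 <= u <= 0xDFFF:
            lone.append(i)  # surrogate without a matching partner
        i += 1
    return lone


text = "a\ud800b"  # a lone high surrogate between two ordinary characters

try:
    text.encode("utf-16-le")  # the strict codec treats it as an error
except UnicodeEncodeError as exc:
    print("strict encode refused:", exc.reason)

# the 'surrogatepass' error handler lets the unpaired code unit through unchanged
raw = text.encode("utf-16-le", errors="surrogatepass")
roundtrip = raw.decode("utf-16-le", errors="surrogatepass")
print(raw, roundtrip == text)
print(find_lone_surrogates([0x61, 0xD800, 0x62]))
```

A process that only stores or forwards text could take the pass-through route, while one that interprets the text could use a check like find_lone_surrogates to flag the problem.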

markus


