Re: Unicode FAQ addendum

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Wed Jul 19 2000 - 13:55:10 EDT

Next message: Mark Davis: "Emoticons"
Previous message: Markus Scherer: "Re: Using unicode in a Java program"
Maybe in reply to: John Cowan: "Unicode FAQ addendum"
Next in thread: John Cowan: "Re: Unicode FAQ addendum"

John Cowan wrote:
> The new Unicode FAQ (like the old) supplies the panting world with
> John's Own Version of Unicode Conformance:

some of the old ones seem to be pre-unicode 1.1. should they not be updated?

> 1) Unicode code units are 16 bits long; deal with it.

this is true for the default encoding form, but not for utf-8 (8-bit code units) or utf-32 (32-bit code units). it is also misleading for utf-16: yes, the code units are 16 bits long (or wide), but you need to know that a code point sometimes takes two of them (a surrogate pair). the above sounds like a statement from the ucs-2 days.
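
to make the code unit vs. code point distinction concrete, here is a minimal sketch in python (not something from the faq; the helper name code_units is made up for illustration). it counts how many code units one code point takes in each encoding form, assuming python's built-in utf-8, utf-16-be and utf-32-be codecs:

```python
def code_units(ch: str) -> dict:
    """Count the code units needed for a single code point.

    Hypothetical helper for illustration only. The "-be" codecs are
    used so that no byte order mark is added to the output.
    """
    return {
        "utf-8": len(ch.encode("utf-8")),            # 8-bit code units
        "utf-16": len(ch.encode("utf-16-be")) // 2,  # 16-bit code units
        "utf-32": len(ch.encode("utf-32-be")) // 4,  # 32-bit code units
    }


# U+0041 fits in one code unit in every form.
print(code_units("\u0041"))
# U+10000 is outside the BMP: two 16-bit code units in utf-16.
print(code_units("\U00010000"))
```

for a code point above U+FFFF the utf-16 count is 2, which is the case that "16 bits long; deal with it" glosses over.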

> 4) Loose surrogates don't mean jack.

this needs some explanation: loose (unpaired) surrogates are illegal sequences, but should be passed through for interoperability (i think that is what the book says). stated by itself, the rule is also misleading.
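
as a rough illustration of what "loose" means, the sketch below scans a sequence of 16-bit code units and reports the positions of unpaired surrogates without changing the data, which matches the pass-through idea. the function name find_loose_surrogates and the list-of-integers input are assumptions made for this example, not anything defined by the standard or the faq:

```python
def find_loose_surrogates(units):
    """Return the indexes of unpaired surrogates in 16-bit code units.

    Hypothetical helper: `units` is assumed to be a sequence of
    integers in the range 0x0000..0xFFFF. The input is only
    inspected, never modified.
    """
    loose = []
    i = 0
    while i < len(units):
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # High surrogate: well-formed only if a low surrogate follows.
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2  # skip the complete pair
                continue
            loose.append(i)
        elif 0xDC00 <= u <= 0xDFFF:
            # Low surrogate with no high surrogate in front of it.
            loose.append(i)
        i += 1
    return loose


# 'A', a well-formed pair for U+10000, then 'B': nothing is loose.
print(find_loose_surrogates([0x0041, 0xD800, 0xDC00, 0x0042]))
# A high surrogate followed by 'A': index 0 is loose.
print(find_loose_surrogates([0xD800, 0x0041]))
```

a caller could use the returned indexes to log or flag the data while still passing the original code units through unchanged.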

markus

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:06 EDT