Re: Purpose of REPLACEMENT CHARACTER

From: John Cowan (cowan@locke.ccil.org)
Date: Sun Apr 11 1999 - 19:59:50 EDT

Next message: Masahiko Maedera: "[Proposal] Extended UTF-16 by using Plane 14"
Previous message: Mark Leisher: "Re: XLFD glyph ranges"
Maybe in reply to: Markus Kuhn: "Purpose of REPLACEMENT CHARACTER"
Next in thread: Markus Kuhn: "Re: Purpose of REPLACEMENT CHARACTER"

Markus Kuhn scripsit:

> If I implement a UTF-8 -> UCS-2 converter, what shall I do with
> malformed UTF-8 sequences? ISO 10646-1 in section 2.3c and section R.7
> clearly requires that malformed UTF-8 sequences are indicated to the
> user. Is replacing any malformed UTF-8 sequence by 0xFFFD appropriate
> use of this character? After all, a malformed UTF-8 sequence is in a
> sense something outside the range of Unicode.

The Plan 9 folks decided no: an unknown character is not the same as an
invalid encoding, which does not represent any character at all.
They map the latter to U+0080, an otherwise unused control character.
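
To make the two policies concrete, here is a minimal sketch of such a
converter, not anyone's actual implementation. The function name
utf8_to_ucs2 and its marker parameter are made up for illustration, and
emitting one marker per offending byte is an assumption of this sketch,
not something the standard or Plan 9 prescribes. Sequences longer than
three bytes are treated as malformed here, since their values do not fit
in a single UCS-2 code unit.

```python
from typing import List


def utf8_to_ucs2(data: bytes, marker: int = 0xFFFD) -> List[int]:
    """Decode UTF-8 into UCS-2 code units (hypothetical helper).

    Every byte that cannot start or complete a well-formed sequence is
    replaced by `marker` (an assumed policy: one marker per bad byte).
    """
    out: List[int] = []
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        if b < 0x80:
            # Plain ASCII byte.
            out.append(b)
            i += 1
            continue
        if 0xC2 <= b <= 0xDF:
            # Lead byte of a two-byte sequence (0xC0 and 0xC1 are overlong).
            need, cp, lowest = 1, b & 0x1F, 0x80
        elif 0xE0 <= b <= 0xEF:
            # Lead byte of a three-byte sequence.
            need, cp, lowest = 2, b & 0x0F, 0x800
        else:
            # Stray continuation byte, overlong lead byte, or a lead byte
            # for a value that does not fit in UCS-2.
            out.append(marker)
            i += 1
            continue
        tail = data[i + 1:i + 1 + need]
        if len(tail) < need or any((t & 0xC0) != 0x80 for t in tail):
            # Truncated sequence or a non-continuation byte in the tail.
            out.append(marker)
            i += 1
            continue
        for t in tail:
            cp = (cp << 6) | (t & 0x3F)
        if cp < lowest or 0xD800 <= cp <= 0xDFFF:
            # Overlong form or an encoded surrogate value.
            out.append(marker)
            i += 1
            continue
        out.append(cp)
        i += 1 + need
    return out
```

Calling utf8_to_ucs2(data) gives the U+FFFD behaviour asked about in the
question; utf8_to_ucs2(data, marker=0x0080) gives the Plan 9 behaviour.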

-- 
John Cowan					cowan@ccil.org
		e'osai ko sarji la lojban.


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:45 EDT