Re: RE: Roundtripping in Unicode

From: Doug Ewell (dewell@adelphia.net)
Date: Mon Dec 13 2004 - 23:21:50 CST

Next message: Doug Ewell: "Re: Subj: Displaying Chinese characters and Chu Nom characters"

Previous message: Philippe Verdy: "Re: Roundtripping in Unicode"
In reply to: Philippe VERDY: "Re: RE: Roundtripping in Unicode"
Next in thread: John Cowan: "Re: RE: Roundtripping in Unicode"
Reply: John Cowan: "Re: RE: Roundtripping in Unicode"

Philippe VERDY wrote:

> (In fact, I think that mapping invalid sequences to U+FFFD is also
> an error: U+FFFD is itself valid, so the presence of the encoding
> error in the source is lost and will not throw exceptions in further
> processing of the remapped text, unless the application constantly
> checks for the presence of U+FFFD in the text stream and every module
> in the application explicitly forbids U+FFFD within its interface...)

Mapping invalid sequences to U+FFFD is explicitly permitted by
conformance clause C12a (TUS 4.0, p. 61):

"When faced with [an] ill-formed code unit sequence while transforming
or interpreting text, a conformant process must treat the first code
unit... as an illegally terminated code unit sequence -- for example, by
signaling an error, filtering the code unit out, or representing the
code unit with a marker such as U+FFFD REPLACEMENT CHARACTER."

Of course, any subsequent process that handles this text would have to
understand this convention, and not choke if handed a U+FFFD.
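
As a rough illustration, the three options named in C12a line up with the
error handlers of Python's built-in UTF-8 decoder. This is only a sketch:
the sample bytes and the function names are made up for the example, and
nothing here comes from the standard itself.

```python
# Hypothetical sample input: the byte 0xFF never occurs in well-formed UTF-8.
RAW = b"ab\xffcd"


def decode_signal(data: bytes) -> str:
    """Option 1: signal an error (raises UnicodeDecodeError on bad input)."""
    return data.decode("utf-8", errors="strict")


def decode_filter(data: bytes) -> str:
    """Option 2: filter the offending code unit out."""
    return data.decode("utf-8", errors="ignore")


def decode_mark(data: bytes) -> str:
    """Option 3: represent the offending code unit with U+FFFD."""
    return data.decode("utf-8", errors="replace")


def has_marker(text: str) -> bool:
    """Check a downstream process could run if it cares about earlier errors."""
    return "\ufffd" in text
```

With the third option, a later stage that wants to know whether the input
was damaged has to look for the marker itself, for instance with something
like has_marker(); otherwise it just passes U+FFFD through as ordinary text.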

-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/

This archive was generated by hypermail 2.1.5 : Mon Dec 13 2004 - 23:23:56 CST