Re: UTF-8 Error Handling (was: Re: Unicode 4.0 BETA available for review)

From: Mark Davis (mark.davis@jtcsv.com)
Date: Mon Mar 03 2003 - 16:07:23 EST

Previous message: Yung-Fong Tang: "Re: Unicode Arabic Rendering Problem"
In reply to: Asmus Freytag: "Re: UTF-8 Error Handling (was: Re: Unicode 4.0 BETA available for review)"
Next in thread: Asmus Freytag: "Re: UTF-8 Error Handling (was: Re: Unicode 4.0 BETA available for review)"
Reply: Asmus Freytag: "Re: UTF-8 Error Handling (was: Re: Unicode 4.0 BETA available for review)"

> anything into the output buffer, even malformed Unicode, and still be
> conformant.

If your converter purports to produce any one of the Unicode encoding forms,
then it cannot conformantly produce malformed Unicode as a result.

If, of course, it does not purport to do that, it can do anything it wants
to.
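
To make the distinction concrete, here is a minimal sketch in Python of the two
behaviours discussed in this thread: substituting and flagging that a
substitution happened, or stopping at the first error and handing the client
the offset. The function names are hypothetical, chosen only for illustration,
and the sketch leans on Python's built-in UTF-8 codec rather than any
particular converter API. In both cases the text that is returned stays
well-formed; the error is reported out of band.

```python
from typing import Optional, Tuple


def decode_substituting(data: bytes) -> Tuple[str, bool]:
    """Policy 1 (hypothetical name): replace ill-formed input with U+FFFD
    and report whether any substitution was made."""
    try:
        return data.decode("utf-8"), False
    except UnicodeDecodeError:
        # errors="replace" substitutes U+FFFD, which is itself valid Unicode.
        return data.decode("utf-8", errors="replace"), True


def decode_stopping(data: bytes) -> Tuple[str, Optional[int]]:
    """Policy 2 (hypothetical name): stop at the first ill-formed sequence
    and return the decoded prefix plus the byte offset of the error."""
    try:
        return data.decode("utf-8"), None
    except UnicodeDecodeError as exc:
        # Everything before exc.start decoded cleanly, so the prefix is valid.
        return data[:exc.start].decode("utf-8"), exc.start


def is_well_formed(text: str) -> bool:
    """True if the text contains no surrogate code points, i.e. it can be
    expressed in any of the Unicode encoding forms."""
    return not any(0xD800 <= ord(ch) <= 0xDFFF for ch in text)
```

A converter that instead wrote an unpaired surrogate into its output would
fail the last check, which is the case that cannot be called conformant
output in a Unicode encoding form.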

Mark
________
mark.davis@jtcsv.com
IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193
(408) 256-3148
fax: (408) 256-0799

----- Original Message -----
From: "Asmus Freytag" <asmusf@ix.netcom.com>
To: "Mark Davis" <mark.davis@jtcsv.com>; "Kent Karlsson"
<kentk@md.chalmers.se>; "'Michael (michka) Kaplan'" <michka@trigeminal.com>
Cc: "'Yung-Fong Tang'" <ftang@netscape.com>; <unicode@unicode.org>
Sent: Monday, March 03, 2003 12:17
Subject: Re: UTF-8 Error Handling (was: Re: Unicode 4.0 BETA available for review)

> At 11:52 AM 3/3/03 -0800, Mark Davis wrote:
> >Perhaps I wasn't clear; I agree with you on that.
> >
> >1) It is conformant to skip or substitute text, with just a code at the
> >end indicating that something of that sort was done.
>
> It's a subtle point, but can be put into your formulation:
>
> What I was after is where the "substitution" itself isn't legal Unicode,
> i.e. an unpaired surrogate in UTF-32. My take is that, formally speaking,
> as long as there's an indication of an error condition, I'm free to put
> anything into the output buffer, even malformed Unicode, and still be
> conformant.
>
> >2) Or, if someone wants more flexibility, to stop at possible errors, and
> >give the client of the API information so that they can do more complex
> >processing.
> >
> >Mark
>
>
>


This archive was generated by hypermail 2.1.5 : Mon Mar 03 2003 - 16:41:16 EST