Re: UTF-8 Error Handling (was: Re: Unicode 4.0 BETA available for review)

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Mon Mar 03 2003 - 15:17:14 EST

    At 11:52 AM 3/3/03 -0800, Mark Davis wrote:
    >Perhaps I wasn't clear; I agree with you on that.
    >
    >1) It is conformant to skip or substitute text, with just a code at the end
    >indicating that something of that sort was done.

    It's a subtle point, but can be put into your formulation:

    What I was after is the case where the "substitution" itself isn't legal
    Unicode, e.g. an unpaired surrogate in UTF-32. My take is that, formally speaking,
    as long as there's an indication of an error condition, I'm free to put
    anything into the output buffer, even malformed Unicode, and still be
    conformant.

    >2) Or, if someone wants more flexibility, to stop at possible errors, and
    >give the client of the API information so that they can do more complex
    >processing.
    >
    >Mark
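
    To make the two options concrete, here is a minimal sketch in Python.
    It is only an illustration, not anything specified by the standard: the
    names DecodeResult, decode_substituting and decode_stopping are
    hypothetical, and Python's built-in UTF-8 codec stands in for whatever
    converter is being discussed. The first function fills the output
    buffer with substituted text and reports an error flag at the end
    (option 1); with errors="surrogateescape" the substitute is itself a
    lone surrogate, i.e. output that is not legal Unicode but is still
    accompanied by an error indication. The second function stops at the
    first problem and hands the offset back to the caller (option 2).

```python
from typing import NamedTuple, Optional


class DecodeResult(NamedTuple):
    """Hypothetical result type: decoded text plus an error indication."""
    text: str
    had_error: bool
    error_offset: Optional[int]  # byte offset of the first ill-formed sequence


def decode_substituting(data: bytes, errors: str = "replace") -> DecodeResult:
    """Option 1: always produce output, and flag at the end that something was substituted.

    errors="replace" substitutes U+FFFD; errors="surrogateescape" substitutes
    a lone surrogate (U+DC80..U+DCFF), so the output itself is malformed.
    """
    try:
        # Fast path: the input is well-formed, nothing to report.
        return DecodeResult(data.decode("utf-8"), False, None)
    except UnicodeDecodeError as exc:
        # Re-decode with substitution and report where the first error was.
        return DecodeResult(data.decode("utf-8", errors=errors), True, exc.start)


def decode_stopping(data: bytes) -> DecodeResult:
    """Option 2: stop at the first error and let the caller decide what to do."""
    try:
        return DecodeResult(data.decode("utf-8"), False, None)
    except UnicodeDecodeError as exc:
        # Everything before exc.start is known to be well-formed UTF-8.
        return DecodeResult(data[:exc.start].decode("utf-8"), True, exc.start)


if __name__ == "__main__":
    bad = b"ab\xffcd"  # 0xFF can never appear in well-formed UTF-8
    print(decode_substituting(bad))
    print(decode_substituting(bad, errors="surrogateescape"))
    print(decode_stopping(bad))
```

    In both cases the caller can tell from had_error that the buffer is not
    a faithful conversion; the difference is only in how much control the
    caller gets over what happens next.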


