From: Mark E. Shoulson (
Date: Fri Sep 07 2007 - 08:45:07 CDT

    Doug Ewell wrote:

    > I'll see if I can find the thread where we talked about that, years ago.
    > Somebody wanted to build that capability into an extension to UTF-8,
    > so it could faithfully represent invalid garbage. We were never able
    > to get him to work through what he wanted to do with the garbage thus
    > preserved.

    Is there an obvious reason we couldn't just treat the garbage UTF-8 as a
    string of 8-bit bytes (it might be part of a binary file or something)
    and Base64-encode it? That would definitely round-trip.
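
To make the suggestion concrete, here is a minimal sketch in Python, assuming the invalid data is handled as opaque bytes and never decoded as text. The function names wrap_garbage and unwrap_garbage are hypothetical, made up for this illustration; only the standard base64 module is real.

```python
import base64


def wrap_garbage(raw: bytes) -> str:
    """Encode arbitrary bytes (valid UTF-8 or not) as plain ASCII Base64."""
    # The input is treated as opaque 8-bit data, so invalid UTF-8 is no problem.
    return base64.b64encode(raw).decode("ascii")


def unwrap_garbage(wrapped: str) -> bytes:
    """Recover the exact original bytes from the Base64 text."""
    return base64.b64decode(wrapped)


if __name__ == "__main__":
    # Hypothetical sample: 0xFF, 0xFE and the overlong pair 0xC0 0x80
    # are all ill-formed in UTF-8.
    garbage = b"abc\xff\xfe\xc0\x80xyz"
    wrapped = wrap_garbage(garbage)
    assert unwrap_garbage(wrapped) == garbage  # byte-for-byte round trip
    print(wrapped)
```

The output is pure ASCII, so it can sit inside a well-formed UTF-8 stream. What this does not settle is how a receiver tells a Base64 run apart from ordinary text; that would need some out-of-band marker, which is left open here.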


