Re: New Charakter Proposal

From: Markus Scherer
Date: Wed Oct 30 2002 - 12:39:23 EST

    Dominikus Scherkl wrote:
    > My other suggestion (and the main reason to call the proposed
    > character "source failure indicator symbol" (SFIS)) was intended
    > especially for malformed UTF-8 input that has overlong encodings.
    > In this special case a converter knows exactly which char is
    > intended, but needs to put out an error to avoid ambiguities.
    > In this case, as of now, it MUST replace the overlong char with U+FFFD
    > (or even cancel the conversion!).
    > But I think SFIS + intended-char is a far better approach,
    > because it
    > 1) warns the reader AND keeps the text readable
    > 2) distinguishes overlong encodings from illegal char sequences.

    This is a special, custom form of error handling - why assign a character for it?

    You could just use an existing character or non-character for this, e.g., U+303E or U+FFFF or U+FDEF
    or similar.
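
To make the idea concrete, here is a minimal Python sketch of such custom error handling. It is an illustration only, not how any existing converter behaves: the name `decode_lenient` and the choice of U+FFFF as the marker are assumptions made for this example. Overlong sequences become marker + intended character, and every other malformed sequence becomes U+FFFD.

```python
REPLACEMENT = "\uFFFD"   # standard replacement character
MARKER = "\uFFFF"        # assumed marker (a noncharacter), private to this sketch

# Smallest code point that legitimately needs a sequence of each length.
MIN_FOR_LENGTH = {2: 0x80, 3: 0x800, 4: 0x10000}


def decode_lenient(data: bytes, marker: str = MARKER) -> str:
    """Decode UTF-8, flagging overlong forms with `marker` + intended char."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:                      # plain ASCII
            out.append(chr(b))
            i += 1
            continue
        if 0xC0 <= b <= 0xDF:             # lead byte of a 2-byte form
            length, cp = 2, b & 0x1F
        elif 0xE0 <= b <= 0xEF:           # lead byte of a 3-byte form
            length, cp = 3, b & 0x0F
        elif 0xF0 <= b <= 0xF7:           # lead byte of a 4-byte form
            length, cp = 4, b & 0x07
        else:                             # stray trail byte or invalid lead byte
            out.append(REPLACEMENT)
            i += 1
            continue
        tail = data[i + 1:i + length]
        if len(tail) != length - 1 or any((t & 0xC0) != 0x80 for t in tail):
            out.append(REPLACEMENT)       # truncated or broken sequence
            i += 1
            continue
        for t in tail:
            cp = (cp << 6) | (t & 0x3F)
        i += length
        if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            out.append(REPLACEMENT)       # out of range or surrogate
        elif cp < MIN_FOR_LENGTH[length]:
            out.append(marker + chr(cp))  # overlong: flag it, keep intended char
        else:
            out.append(chr(cp))
    return "".join(out)
```

For example, `decode_lenient(b"A\xC0\xAFB")` yields `"A\uFFFF/B"`, because `C0 AF` is an overlong form of `/`. Anything that later compares or filters this text has to treat the marker as an error signal; otherwise the overlong form has effectively been accepted.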


    Opinions expressed here may not reflect my company's positions unless otherwise noted.