RE: Stateful encoding mechanisms

From: Peter Constable (petercon@microsoft.com)
Date: Thu May 19 2005 - 10:46:13 CDT

    > From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf Of Dean Snyder

    > Stateful mechanisms that contribute to fragility at the character level -
    > Surrogates
    > BOM
    >
    > Stateful mechanisms that contribute to fragility above the character level -
    > Bidirectional Ordering Controls
    > Annotation characters

    Note that surrogates, BOM and annotation characters (FFF9..FFFB) are not
    used in the text content of a file:

    - BOM is a control used within certain encoding-scheme mechanisms

    - the surrogate *codepoints* are not assigned to characters; surrogate
    *code units* are used within a particular encoding-form mechanism

    - annotation characters are intended for process-internal usage (like
    the non-characters at FDD0..FDEF), not for interchange
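
    As a rough illustration of the first two points, here is a minimal
    Python sketch. It assumes CPython 3's built-in codecs; the sample
    character U+10400 and the helper name encodes_cleanly are arbitrary
    choices made for this example, not anything from the standard.

```python
import codecs

# U+10400 lies outside the BMP, so the UTF-16 encoding form
# represents it with a pair of surrogate code units.
ch = "\U00010400"

utf16 = ch.encode("utf-16-be")
units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]

# The same character in UTF-8 involves no surrogates at all:
# they belong to one encoding form, not to the text itself.
utf8 = ch.encode("utf-8")


def encodes_cleanly(text):
    """Hypothetical helper: True if text is accepted by the strict UTF-8 codec."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# A lone surrogate *code point* is not a character, so it is rejected.
lone_surrogate_ok = encodes_cleanly("\ud801")

# The BOM belongs to the encoding scheme: the "utf-16" codec consumes
# it while decoding, and it does not appear in the resulting text.
with_bom = codecs.BOM_UTF16_LE + "A".encode("utf-16-le")
decoded = with_bom.decode("utf-16")

print([hex(u) for u in units], utf8, lone_surrogate_ok, repr(decoded))
```

    The decoded string contains only "A"; the BOM and the surrogate
    code units exist at the byte and code-unit level, not in the
    character content.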

    What you say about the bidi controls is entirely correct, though.

    Peter Constable


