Re: Stateful encoding mechanisms

From: Marcin 'Qrczak' Kowalczyk (qrczak@knm.org.pl)
Date: Fri May 20 2005 - 17:10:15 CDT

    Dean Snyder <dean.snyder@jhu.edu> writes:

    > If <0xD800 0xDF02> is interpreted differently than <0xD801 0xDF02>,
    > then the high surrogate is altering the interpretation of 0xDF02,
    > the low surrogate. I assert that that is stateful in the context of
    > discussing fragment fragility.

    It's a much more tractable kind of statefulness. Whatever the two
    kinds are called, they should be distinguished.

    In UTF-16, for each boundary between characters you can find the
    corresponding boundary in the encoded text, so the fragments can be
    physically put together in a different order, as far as surrogates are
    concerned (but not with respect to a BOM). This applies to UTF-8 too,
    of course.

    This is not true for ISO-2022.
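
    The difference can be sketched in Python. This is only an
    illustration under stated assumptions: the helper names
    (utf16_units, split_at_boundaries, decode_units) are made up for
    this example, the UTF-16 is big-endian without a BOM, and
    ISO-2022-JP (Python's "iso2022_jp" codec) stands in for the
    ISO-2022 family.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text (big-endian, no BOM) as ints."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def is_low_surrogate(unit):
    return 0xDC00 <= unit <= 0xDFFF


def split_at_boundaries(units):
    """Cut the code units into one fragment per character.

    A boundary is recognised locally: every unit that is not a low
    surrogate starts a new character.
    """
    fragments = []
    for unit in units:
        if is_low_surrogate(unit) and fragments:
            fragments[-1].append(unit)  # second half of a surrogate pair
        else:
            fragments.append([unit])
    return fragments


def decode_units(units):
    data = b"".join(u.to_bytes(2, "big") for u in units)
    return data.decode("utf-16-be")


# U+10302 is <0xD800 0xDF02>, U+10702 is <0xD801 0xDF02>.
text = "a\U00010302b\U00010702"
fragments = split_at_boundaries(utf16_units(text))

# Reassemble the fragments in reverse order: still valid UTF-16,
# and each character keeps its identity.
reordered = decode_units([u for frag in reversed(fragments) for u in frag])

# ISO-2022-JP: the bytes of a character depend on an earlier escape sequence.
ESC_JIS = b"\x1b$B"    # switch to JIS X 0208
ESC_ASCII = b"\x1b(B"  # switch back to ASCII

encoded = "\u3042".encode("iso2022_jp")  # HIRAGANA LETTER A
payload = encoded[len(ESC_JIS):-len(ESC_ASCII)]

# The same two bytes, moved away from their escape sequence,
# decode as two unrelated ASCII characters.
orphaned = payload.decode("iso2022_jp")
```

    Here reordered equals text reversed character by character, while
    orphaned is a pair of ASCII characters rather than the original
    hiragana: the surrogate "state" never reaches past the adjacent
    code unit, whereas the ISO-2022 shift state lasts until the next
    escape sequence.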

    -- 
       __("<         Marcin Kowalczyk
       \__/       qrczak@knm.org.pl
        ^^     http://qrnik.knm.org.pl/~qrczak/
    

