Re: Stateful encoding mechanisms

From: Tim Greenwood (timothy.greenwood@gmail.com)
Date: Fri May 20 2005 - 12:24:52 CDT


    On 5/19/05, Dean Snyder <dean.snyder@jhu.edu> wrote:
    > Well that, of course, depends on how you define state, acknowledgment of
    > which, I presume, is related to both your qualified dissension and your
    > use of quotes around the word "state" here.

    While I do not agree that your definition of state matches the
    commonly accepted one, it is a coherent argument. However, if you make
    that argument then you must address Ken's other point. You criticise
    the use of 'stateful' code units in UTF-16, yet do not do the same for
    UTF-8. Why not? The structure of both is very similar. In both, a
    Unicode character is encoded by a sequence of one or more base code
    units. The only difference is that when individual code units (from
    the set that requires more than one to map to a character) are
    interpreted as numbers, those from UTF-8 have a corresponding Unicode
    character and those from UTF-16 do not. Would you prefer that the
    surrogate area had also been assigned individual characters? It would
    not change the model at all, just make processing less efficient, but
    on a par with UTF-8.
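
    To make the comparison concrete, here is a rough Python sketch. The
    sample character U+10400 and the helper name code_units are my own
    choices for illustration. It encodes one supplementary character both
    ways, then reads each code unit as a bare number and asks what that
    code point's general category is.

```python
import unicodedata

def code_units(ch):
    """Return (utf8_units, utf16_units) for a single character."""
    utf8 = list(ch.encode("utf-8"))  # 8-bit code units
    raw = ch.encode("utf-16-be")     # big-endian, no BOM
    utf16 = [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]
    return utf8, utf16

# U+10400 lies outside the BMP, so it needs several code units in both forms.
utf8, utf16 = code_units("\U00010400")

# Read each code unit as a number and look up that code point's category.
utf8_categories = [unicodedata.category(chr(u)) for u in utf8]
utf16_categories = [unicodedata.category(chr(u)) for u in utf16]

print([hex(u) for u in utf8], utf8_categories)
print([hex(u) for u in utf16], utf16_categories)
```

    Every UTF-16 unit falls in category Cs (surrogate), while none of the
    UTF-8 units does: their numeric values coincide with code points in
    the C1 control and Latin-1 range. The multi-unit structure is the same
    in both cases.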

    Tim


