Re: Ill-formed sequences (was: Re: UTF-16 inside UTF-8)

From: Doug Ewell (dewell@adelphia.net)
Date: Wed Nov 05 2003 - 14:03:15 EST


    Addison Phillips [wM] <aphillips at webmethods dot com> wrote:

    >> I assume that by “multiple UTF-8 sequences that could represent the
    >> same logical text,” Adobe is referring to non-shortest UTF-8
    >> sequences such as <C0 80> and not to Unicode canonical equivalences
    >> or something else. No similar warning about “multiple sequences” is
    >> given in the sections that deal with UTF-16.
    >
    > I am under the impression that they mean combining sequences.

    That would be good, but then I wonder why they phrased the issue in
    terms of UTF-8 rather than Unicode generally.
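
    To make the two readings concrete, here is a minimal Python sketch.
    It assumes Python's built-in strict "utf-8" codec as a stand-in for a
    conforming decoder; the helper names are made up for illustration and
    do not come from Adobe's document. The non-shortest form <C0 80> is
    simply ill-formed, whereas canonically equivalent combining sequences
    are well-formed and differ in UTF-16 just as they do in UTF-8.

```python
import unicodedata


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if a strict UTF-8 decoder accepts the byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def canonically_equivalent(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Reading 1: non-shortest UTF-8. <C0 80> is an overlong spelling of U+0000.
overlong_nul = bytes([0xC0, 0x80])
shortest_nul = bytes([0x00])

# Reading 2: combining sequences. U+00E9 versus "e" followed by U+0301.
precomposed = "\u00e9"
combining = "e\u0301"

if __name__ == "__main__":
    print(is_well_formed_utf8(overlong_nul))
    print(is_well_formed_utf8(shortest_nul))
    print(canonically_equivalent(precomposed, combining))
    # The two spellings differ at the code unit level in both encoding forms.
    print(precomposed.encode("utf-8") != combining.encode("utf-8"))
    print(precomposed.encode("utf-16-le") != combining.encode("utf-16-le"))
```

    Under the first reading the problem is specific to UTF-8, since UTF-16
    has no non-shortest forms. Under the second it applies equally to every
    encoding form, which is why the UTF-8-only phrasing seems odd.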

    -Doug Ewell
     Fullerton, California
     http://users.adelphia.net/~dewell/


