RE: Fun with proof by analogy, was Re: Mojibake on my Web pages

From: Francois Yergeau (FYergeau@alis.com)
Date: Mon Sep 29 2003 - 13:21:57 EDT


    Jill Ramonsky wrote:
    > First point - if no information is present, assume "us-ascii".
    > Sounds extremely sensible to me.

    Sounds very misguided to me.

    > ASCII is the intersection of Latin-1, UTF-8, and various other
    > commonly used encodings.

    How does that make it more likely that guessing ASCII would be correct?

    > Moreover, in order to even read the name of the encoding, the
    > name of the encoding must have itself been encoded in something.

    See Appendix F of the XML spec for how you can do much better than assuming
    ASCII to read the encoding name.
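
As a rough illustration of the Appendix F idea, here is a minimal Python sketch that guesses the encoding family from the first four bytes of an XML entity, which is enough to go on and read the encoding declaration. The function name `sniff_encoding_family` and the returned labels are made up for this example. The unusual UCS-4 byte orders listed in the appendix are left out, so this is a simplification and not a complete implementation.

```python
def sniff_encoding_family(head: bytes) -> str:
    """Guess the encoding family from the first four bytes of an XML entity.

    Simplified: the unusual UCS-4 byte orders (2143, 3412) are not handled.
    """
    b = head[:4]

    # With a byte order mark. The 4-byte UCS-4 marks are tested first,
    # because FF FE 00 00 also starts with the UTF-16 little-endian mark.
    if b == b"\x00\x00\xfe\xff":
        return "UCS-4 big-endian"
    if b == b"\xff\xfe\x00\x00":
        return "UCS-4 little-endian"
    if b[:2] == b"\xfe\xff":
        return "UTF-16 big-endian"
    if b[:2] == b"\xff\xfe":
        return "UTF-16 little-endian"
    if b[:3] == b"\xef\xbb\xbf":
        return "UTF-8"

    # Without a byte order mark: look at how "<?xm" (or "<") is laid out.
    if b == b"\x00\x00\x00\x3c":
        return "UCS-4 big-endian"
    if b == b"\x3c\x00\x00\x00":
        return "UCS-4 little-endian"
    if b == b"\x00\x3c\x00\x3f":
        return "UTF-16 big-endian"
    if b == b"\x3c\x00\x3f\x00":
        return "UTF-16 little-endian"
    if b == b"\x3c\x3f\x78\x6d":
        # "<?xm" in UTF-8, ASCII, ISO 8859-x, Shift_JIS, and so on:
        # the declaration itself can be read as ASCII to find the exact name.
        return "ASCII-compatible"
    if b == b"\x4c\x6f\xa7\x94":
        return "EBCDIC"

    # Anything else: no declaration can be present, so XML assumes UTF-8.
    return "UTF-8"
```

For example, `sniff_encoding_family(b"<?xml version='1.0'?>")` returns `"ASCII-compatible"`, after which the declaration can be parsed to learn the actual encoding name.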

    > It makes sense to me to assume the absolute minimum. If you want
    > more than the minimum, declare your encoding. This should not be
    > a problem.

    It makes much more sense to me to assume UTF-8, as XML does. If you want
    *less* than that, declare your encoding. This is not a problem.
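
To make the point concrete, here is a small Python sketch (the helper `decode_undeclared` is hypothetical, not part of any library). Every ASCII byte sequence is also valid UTF-8, so a UTF-8 default reads pure ASCII input unchanged, while an ASCII default rejects anything beyond the 7-bit range.

```python
def decode_undeclared(data: bytes, default: str = "utf-8") -> str:
    """Decode bytes that carry no encoding declaration, using a default."""
    return data.decode(default)


ascii_bytes = "plain text".encode("ascii")
utf8_bytes = "François".encode("utf-8")

# A UTF-8 default handles both inputs.
assert decode_undeclared(ascii_bytes) == "plain text"
assert decode_undeclared(utf8_bytes) == "François"

# An ASCII default fails as soon as a non-ASCII character appears.
try:
    decode_undeclared(utf8_bytes, default="ascii")
    ascii_default_failed = False
except UnicodeDecodeError:
    ascii_default_failed = True
assert ascii_default_failed
```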

    -- 
    François
    

