Re: FYI: Google blog on Unicode

From: Doug Ewell (doug@ewellic.org)
Date: Mon Feb 08 2010 - 20:56:51 CST


    Never mind:

    > 2. Use charset detection, which uses a number of signals. The primary
    > signal is a statistical analysis of the bytes in the document, but the
    > charset tagging is taken into account (and can sometimes make a
    > difference).

    I was blind and missed the part where option 2 does take the tagging
    into account. Then again, if that is working correctly, how does
    properly tagged 8859-2 get represented as 8859-1?
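
To make the question concrete, here is a minimal sketch of what that mislabeling looks like at the byte level. The sample word is my own assumption, not taken from the Google post, and the code only uses Python's standard codecs. It shows the same bytes decoded with the tagged charset and with the one a detector might wrongly pick; it does not model the detector itself.

```python
# Hypothetical sample text: a Polish place name containing letters that
# exist in ISO 8859-2 but not in ISO 8859-1.
text = "\u0141\u00f3d\u017a"  # "Łódź"

# Bytes as a properly tagged ISO 8859-2 page would send them.
raw = text.encode("iso8859_2")

# Decoding with the tagged charset recovers the original text.
as_latin2 = raw.decode("iso8859_2")

# Decoding the same bytes as ISO 8859-1 never fails, because every byte
# value is assigned in that charset, but the letters come out wrong.
as_latin1 = raw.decode("iso8859_1")

print(raw)
print(as_latin2)
print(as_latin1)
```

Since every byte sequence is valid in both charsets, nothing in the decoding step can flag the mistake; only the tag or the statistics can tell the two apart.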

    --
    Doug Ewell  |  Thornton, Colorado, USA  |  http://www.ewellic.org
    RFC 5645, 4645, UTN #14  |  ietf-languages @ http://is.gd/2kf0s
    

