Re: FYI: Google blog on Unicode

From: Doug Ewell (doug@ewellic.org)
Date: Tue Feb 09 2010 - 07:29:11 CST

    "verdy_p" <verdy underscore p at wanadoo dot fr> wrote:

    > If the algorithm treats the ISO 8859-x tag as unreliable because the page
    > contains some Windows 125x characters (in the code range 0x80-0x9F),
    > the tag is probably wrong: assume Windows 125x instead and use it as the
    > secondary indicator (after the statistical estimation heuristic).

    That's what I said. Maybe if browsers followed this strategy anyway,
    the authors of HTML5 wouldn't have felt it necessary to demand it.
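
A minimal sketch of the strategy described above, assuming Python's built-in codecs. The names `C1_SUPERSETS`, `has_c1_bytes` and `effective_encoding` are hypothetical, and the mapping is limited to the two pairs that differ only in the 0x80-0x9F range:

```python
# Hypothetical mapping from a declared ISO 8859 label to the Windows
# code page that differs from it only in the 0x80-0x9F range.
C1_SUPERSETS = {"iso-8859-1": "cp1252", "iso-8859-9": "cp1254"}


def has_c1_bytes(data: bytes) -> bool:
    """Return True if any byte falls in the C1 range 0x80-0x9F."""
    return any(0x80 <= b <= 0x9F for b in data)


def effective_encoding(declared: str, data: bytes) -> str:
    """Pick the codec to decode with, given the declared label and the bytes.

    If the page is tagged ISO 8859-x but contains C1-range bytes, assume
    the corresponding Windows code page instead.
    """
    name = declared.strip().lower()
    if name in C1_SUPERSETS and has_c1_bytes(data):
        return C1_SUPERSETS[name]
    return name


page = b"\x93quoted\x94 text"  # 0x93/0x94 are curly quotes in Windows-1252
print(page.decode(effective_encoding("ISO-8859-1", page)))
```

This sketch deliberately leaves out the statistical estimation step and only covers the declared-tag fallback.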

    Note, though, that there are only a few Windows single-byte code pages
    which differ from ISO 8859 counterparts only by adding glyphs in the C1
    range. The relationship between 1252 and 8859-1 is well known, and 1254
    and 8859-9 are like this as well, but most are not; 1250 and 8859-2, for
    example, have numerous other differences.
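
One way to check the point about code page pairs is to compare how each pair decodes the bytes above the C1 range. This is a rough sketch using Python's built-in codecs; the helper name `differing_bytes` is hypothetical:

```python
def differing_bytes(windows_codec: str, iso_codec: str,
                    start: int = 0xA0, stop: int = 0x100) -> list[int]:
    """List byte values in [start, stop) that the two codecs decode differently."""
    diffs = []
    for b in range(start, stop):
        raw = bytes([b])
        # errors="replace" keeps the comparison going if a byte is undefined.
        w = raw.decode(windows_codec, errors="replace")
        i = raw.decode(iso_codec, errors="replace")
        if w != i:
            diffs.append(b)
    return diffs


for win, iso in [("cp1252", "iso8859_1"),
                 ("cp1254", "iso8859_9"),
                 ("cp1250", "iso8859_2")]:
    print(win, "vs", iso, "->", len(differing_bytes(win, iso)),
          "differing bytes in 0xA0-0xFF")
```

The first two pairs should report no differences outside the C1 range, while 1250 and 8859-2 disagree at several positions, so swapping one for the other is not safe there.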

    --
    Doug Ewell  |  Thornton, Colorado, USA  |  http://www.ewellic.org
    RFC 5645, 4645, UTN #14  |  ietf-languages @ http://is.gd/2kf0s
    

