Re: Frequent incorrect guesses by the charset autodetection in IE7

From: Philippe Verdy (
Date: Sun Jul 16 2006 - 08:28:36 CDT


    From: "James Kass" <>
    > Quoting from
    > "... it is desirable to register the proposed standard code-set TSCII
    > as an international ISO standard."
    > My understanding is that efforts along those lines failed.
    > But, the user communities did attempt to propagate TSCII. For
    > example, Pango included support for TSCII in its first public
    > version.

    China was more successful: it required support for GB18030 and eventually got it from most large OS vendors. China did not need ISO standardization to make it an international standard, as it gained support from its Chinese-speaking community alone; admittedly, there was legacy support for GB2312 (predating Unicode and ISO 10646), along with some well-known vendor extensions that were finally integrated into GB18030.

    Thailand had TIS-620 long before Unicode and ISO, and could develop and impose it without asking for international support at ISO.

    Neither example could fit the ISO 8859 encoding model (made mostly for simple alphabets or abjads), although both were compatible with ISO 646 (not ISO 646/US, alias US-ASCII). ISO could still have adopted them as separate standards if these countries had requested it and agreed to provide a fair licensing policy and implementation support for worldwide use.

    These countries may simply have thought that international support was not needed, as they exchanged too little data with the rest of the world, and that they could remain the authoritative source for these standards without bearing the additional cost (in time and money) of international maintenance.

    So an ISO standard is not required to support a language: a de facto industry standard can be established locally and then spread by contamination. What matters more is support from users and their involvement in its evolution. But sooner or later, even those legacy 8-bit national encoding standards will become deprecated as ISO 10646 and Unicode gain ground there, for better interoperability; this is already happening because of Internet technologies and the choice by the IETF and IEC to prefer the ISO 10646-supported UTFs in all new protocols, whenever localization is possible and valuable.

    This archive was generated by hypermail 2.1.5 : Sun Jul 16 2006 - 08:37:24 CDT