RE: internationalization assumption

From: Mike Ayers
Date: Wed Oct 06 2004 - 16:51:53 CST

    > There is 2 products for English version. One is coded by
    > UTF8 and the other is coded by NON-UTF8. Both products are
    > internationalized readiness.

            I interpret that last sentence to mean that both products are being
    tested to determine that they are internationalized (i.e. localization

    > Let's say. The test engineer ensures the functionality and
    > validates the input and output on major Latin 1 languages,
    > such as German, French, Spanish, Italian, as well as Korean,
    > Japanese, Chinese.
    > If those products handle all languages as addressed above,
    > could it be assumed that the entire character sets in whole
    > latin 1, Han, Cyrillic, Arabic.... can be certifed on both products???
    > Please make any comments on this assumption.

            The non-UTF-8 (more accurately, non-Unicode) application can only be
    certified for languages which are handled by whatever character set it uses.
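
            As a rough illustration of that limit, the Python sketch below checks
    which sample strings a given character set can represent at all. The helper
    name `encodable`, the sample strings, and the choice of charsets are
    assumptions made for illustration, not details of the products discussed.

```python
def encodable(text: str, charset: str) -> bool:
    """Return True if every character of text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical sample strings, one per language of interest.
samples = {
    "German": "Größe",
    "Japanese": "日本語",
    "Arabic": "مرحبا",
}

# Report which sample languages each charset can carry.
for charset in ("latin-1", "shift_jis", "utf-8"):
    supported = [lang for lang, s in samples.items() if encodable(s, charset)]
    print(charset, supported)
```

    A check like this only shows that the charset can carry the text; it says
    nothing about whether the application processes or displays it correctly.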

            It is a common strategy to isolate the text processing routines and
    text in an application and change only those parts to support different
    character sets. This is common in pre-Unicode applications and applications
    which only need to support a very limited set of languages (e.g. English and
    Japanese). For such a strategy, each build must be tested completely for
    all supported character sets.

            For Unicode applications, Latin 1 testing is insufficient, even for
    internationalization testing. Internationalization tests should verify, at
    minimum, that characters above U+1000 and up to U+FFFF (basically, all of
    the BMP) can be used. It is also good to verify support for U+10000 and
    above, or at least determine whether or not it exists for your application.
    I usually test English and Japanese for BMP conformance. For characters
    beyond the BMP, while all the applications I've tested so far have
    specifically excluded this range, I still have a simple strategy: snip the
    Deseret text from James Kass' script links page and use that
    (thanks, James!).
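
            A minimal sketch of such test data in Python follows. The sample
    strings are assumptions (the Deseret sample is just four arbitrary letters
    from the Deseret block, not text from James' page), and the helper names
    are hypothetical. It classifies code points as BMP or supplementary and
    checks that each sample survives an encode/decode round trip.

```python
# Hypothetical test samples: ASCII, BMP (Japanese), and beyond the BMP (Deseret).
SAMPLES = {
    "english": "internationalization",
    "japanese": "国際化のテスト",
    "deseret": "\U00010414\U00010428\U00010445\U00010428",
}


def plane_report(text):
    """Return (count of BMP code points, count of supplementary code points)."""
    bmp = sum(1 for ch in text if ord(ch) <= 0xFFFF)
    return bmp, len(text) - bmp


def utf16_units(text):
    """Number of 16-bit code units needed to store text in UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def round_trips(text, encoding):
    """True if text is unchanged after encoding and decoding."""
    return text.encode(encoding).decode(encoding) == text


for name, text in SAMPLES.items():
    bmp, supplementary = plane_report(text)
    ok = all(round_trips(text, enc) for enc in ("utf-8", "utf-16", "utf-32"))
    print(name, bmp, supplementary, utf16_units(text), ok)
```

    Each supplementary character takes two UTF-16 code units (a surrogate
    pair), which is where applications that assume one unit per character
    tend to break.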

            Note that none of the above refers to localization testing, which
    still must be done for every supported language-charset combination (this
    is where Unicode can really pay off, by reducing things to one charset per
    language). Internationalization testing should only determine the ability
    of your application to handle other languages; it is localization testing
    that determines whether it actually handles a given language, and would
    include such things as text entry and display, text conversion,
    coexistence, etc., as applicable.


