Re: ISO 639 "duplicate" codes (was: Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures)

From: Addison Phillips [wM]
Date: Tue Jul 15 2003 - 12:25:10 EDT


    Philippe wrote:

    >>I have tried several times to do it. It does not work: you may
    >>effectively remove some tables you don't need, but trying
    >>to extract just the normalizer is a real nightmare. I tried it
    >>in the past, and abandoned it: too tricky to maintain, and I
    >>retried it recently (one month ago, from its CVS source) and
    >>this was even worse than the first time.

    webMethods includes the ICU normalizer in a couple of our products. The
    code for one of these products requires JDK 1.2.2, so, since I had to
    compile ICU anyway, I took the time to figure out the dependencies and
    build only what I needed.

    The list of classes required for the normalizer is actually quite small.
    Of the 1.3MB ICU4j.jar, only 400K are required for the normalizer to
    operate correctly. Source changes required. I will gladly send a
    complete list of classes to anyone who would like it. It took me a day
    to do the work (it took longer to test it than to build it).
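For anyone who already has such a list, here is a minimal sketch of how a "light" jar could be produced from the full one. It is not the build procedure described above, only an illustration: the class name LightJar, the method copySubset, and the one-entry-per-line list file are all hypothetical, and it assumes a modern JDK (Java 8 or later) rather than JDK 1.2.2. The list would need to name any data files the normalizer loads as well as the classes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Enumeration;
import java.util.HashSet;
import java.util.Set;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical helper: copies a chosen subset of entries from one jar to another.
public class LightJar {

    /**
     * Copies only the entries whose names appear in keep (for example
     * "com/ibm/icu/text/Normalizer.class") from source into a new jar at
     * target. Returns the number of entries copied.
     */
    public static int copySubset(Path source, Path target, Set<String> keep)
            throws IOException {
        int copied = 0;
        try (JarFile in = new JarFile(source.toFile());
             OutputStream raw = Files.newOutputStream(target);
             JarOutputStream out = new JarOutputStream(raw)) {
            Enumeration<JarEntry> entries = in.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                // Skip directories and anything not on the keep list.
                if (entry.isDirectory() || !keep.contains(entry.getName())) {
                    continue;
                }
                // A fresh JarEntry lets the output stream recompute sizes and CRCs.
                out.putNextEntry(new JarEntry(entry.getName()));
                try (InputStream data = in.getInputStream(entry)) {
                    byte[] buffer = new byte[8192];
                    int n;
                    while ((n = data.read(buffer)) != -1) {
                        out.write(buffer, 0, n);
                    }
                }
                out.closeEntry();
                copied++;
            }
        }
        return copied;
    }

    // Usage: java LightJar full.jar light.jar entries.txt
    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: java LightJar <full.jar> <light.jar> <entry-list.txt>");
            return;
        }
        // The list file is assumed to hold one jar entry name per line.
        Set<String> keep = new HashSet<>(Files.readAllLines(Paths.get(args[2])));
        int copied = copySubset(Paths.get(args[0]), Paths.get(args[1]), keep);
        System.out.println("Copied " + copied + " entries");
    }
}
```

Entry names use forward slashes and include the ".class" suffix, exactly as they appear in the jar. The manifest is copied only if "META-INF/MANIFEST.MF" is on the list.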

    Adding the normalizer to the JDK itself would also not be difficult
    for Sun: a version of the normalizer is already in the JDK, but it
    is private.

    I will admit that it used to be quite difficult, back in the ICU 1.x
    days, to separate out the normalizer, but I've done that too (for
    reasons I shan't enumerate). I had to modify some source code to make it
    work, but that was mostly because I needed JDK 1.1.x. That JAR file is
    even smaller, at 161K. Building updated data tables is actually easier
    with the old source code...

    In any event, you really ought to try out the newer versions of ICU4J.
    They are a lot easier to work with. And a "light" version isn't that
    hard to create, if that's what you want.

    Best Regards,


    Addison P. Phillips
    Director, Globalization Architecture
    webMethods, Inc.
    +1 408.962.5487
    Internationalization is an architecture. It is not a feature.
    Chair, W3C I18N WG Web Services Task Force
