Re: Small Java implementation of NFC

From: Rick McGowan (rick@unicode.org)
Date: Wed Mar 09 2005 - 20:17:02 CST

    A number of mail messages from Mark Davis were not distributed by this
    server due to a configuration problem. Attached below is a copy of one such
    message.

            Rick

    --- Below this line is a copy of the message.

    Date: Fri, 4 Mar 2005 16:00:17 -0800
    MIME-Version: 1.0

    It is indeed possible to have a small implementation of NFC; one can make a
    very small one by trading speed against size, as per
    http://www.macchiato.com/unicode/normalization_footprint.htm (the sizes are
    for an older version of Unicode, but wouldn't grow much).

    However, I have one concern; I think it was you who distributed a
    non-conformant BIDI implementation. Have you verified that your
    implementation below is conformant? In particular, does it pass the
    Normalization tests in the UCD, such as

    http://unicode.org/Public/4.0-Update/NormalizationTest-4.0.0.txt

    Note also that we have added tests for 4.1.0 that cover the cases exposed
    in PRI #29:

    http://unicode.org/Public/4.1.0/ucd/NormalizationTest-4.1.0d11.txt
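
    As a rough illustration of what such a check involves, here is a minimal
    sketch in Java. It assumes the usual layout of the data lines in
    NormalizationTest.txt (five semicolon-separated columns c1..c5 of
    space-separated hex code points, with # starting a comment) and the NFC
    invariants stated in that file's header: c2 == NFC(c1) == NFC(c2) ==
    NFC(c3), and c4 == NFC(c4) == NFC(c5). The class name NfcConformanceCheck
    and its methods are hypothetical, and java.text.Normalizer is only a
    stand-in for the implementation under test.

```java
import java.text.Normalizer;
import java.util.function.UnaryOperator;

public class NfcConformanceCheck {

    /** Turns a column such as "0044 0307" into the string it encodes. */
    static String parseColumn(String column) {
        StringBuilder sb = new StringBuilder();
        for (String hex : column.trim().split("\\s+")) {
            sb.appendCodePoint(Integer.parseInt(hex, 16));
        }
        return sb.toString();
    }

    /**
     * Checks the NFC invariants for one line of the test file.
     * Blank lines, comment lines and "@Part" headers carry no data
     * and are treated as passing.
     */
    static boolean checkLine(String line, UnaryOperator<String> nfc) {
        int hash = line.indexOf('#');
        String data = (hash >= 0 ? line.substring(0, hash) : line).trim();
        if (data.isEmpty() || data.startsWith("@")) {
            return true;
        }
        String[] cols = data.split(";");
        if (cols.length < 5) {
            return false; // malformed data line
        }
        String c1 = parseColumn(cols[0]);
        String c2 = parseColumn(cols[1]);
        String c3 = parseColumn(cols[2]);
        String c4 = parseColumn(cols[3]);
        String c5 = parseColumn(cols[4]);
        return c2.equals(nfc.apply(c1))
            && c2.equals(nfc.apply(c2))
            && c2.equals(nfc.apply(c3))
            && c4.equals(nfc.apply(c4))
            && c4.equals(nfc.apply(c5));
    }

    public static void main(String[] args) {
        // Stand-in for the implementation under test (assumption: swap in your own).
        UnaryOperator<String> jdk = s -> Normalizer.normalize(s, Normalizer.Form.NFC);
        String sample = "1E0A;1E0A;0044 0307;1E0A;0044 0307; # LATIN CAPITAL LETTER D WITH DOT ABOVE";
        System.out.println(checkLine(sample, jdk));
    }
}
```

    To test a different implementation, pass it in place of the lambda and
    run checkLine over every line of the file. This sketch does not cover the
    additional requirement that characters absent from Part 1 of the file
    normalize to themselves.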

    Mark

    ----- Original Message -----
    From: "Mark Leisher" <mleisher@crl.NMSU.Edu>
    To: "Unicode List" <unicode@unicode.org>
    Sent: Thursday, March 03, 2005 07:32
    Subject: Re: Small Java implementation of NFC

    > Elliotte Harold wrote:
    > > Currently my Java library (XOM) is dragging along a hefty chunk (344K)
    > > of IBM's open source ICU just to support one rarely invoked method that
    > > converts strings into NFC. I'd like to get rid of that. Given the nature
    > > of my application it is more important to me to be able to eliminate the
    > > extra jar file and its size, than it is to have the fastest, most
    > > intelligent NFC algorithm.
    >
    > Check out http://crl.nmsu.edu/~mleisher/ucdata.html. I have a Java class
    > that provides composition, decomposition, character types and combining
    > class info. Compiled, the class is about 14K. With the four relevant
    > compiled data files, it comes to about 88K total.
    >
    > You would have to provide the NFC routine on top of this.
    > --
    > --------------------------------------------------------------------------
    > Mark Leisher
    > Computing Research Lab            All political parties die at last of
    > New Mexico State University       swallowing their own lies.
    > Box 30001, MSC 3CRL                  -- John Arbuthnot (1667-1735)
    > Las Cruces, NM  88003
    >
    >


