Re: Looking for a C library that converts UTF-8 strings from their decomposed to pre-composed form

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Tue Nov 09 2004 - 12:21:45 CST


    Deborah is right, and I am sorry. I did not read the email carefully.

    ICU works on Solaris and provides normalization APIs. As is usual for text processing, its APIs work on
    UTF-16 strings. ICU also has charset converters and dedicated functions such as u_strFromUTF8() and
    u_strToUTF8() for converting between UTF-8 and UTF-16.

    http://oss.software.ibm.com/icu/

    See also http://www.unicode.org/onlinedat/products.html#3

    markus

    Deborah Goldsmith wrote:
    > I think he's saying he wants to convert to NFC *from* Mac OS X data, in
    > which case the fact that Mac OS X's file system normalization is not
    > strict NFD doesn't really matter. Also, he says he's running on Solaris,
    > which would make it a tad difficult to call a Mac OS X API. ICU should
    > do the trick.
    >> Tay, William wrote:
    >>
    >>> Is there any C library available that converts the decomposed UTF-8 byte
    >>> streams into the pre-composed equivalent?


