Re: Looking for a C library that converts UTF-8 strings from their decomposed to pre-composed form

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Mon Nov 08 2004 - 19:17:47 CST


    Tay, William wrote:
    > Is there any C library available that converts the decomposed UTF-8 byte
    > streams into the pre-composed equivalent?

    MacOS X does decompose filenames, but it does not use standard Unicode normalization, because it
    was designed before Unicode normalization was finalized. I suggest you search the archive of this
    mailing list for more details. You probably need to use a MacOS system function.

    ICU has options for normalization (some defined with internal constants only) which may or may not
    match, or get close to, MacOS filename normalization: http://oss.software.ibm.com/cgi-bin/icu/nbrowser
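
    To make the question concrete: the decomposed form of "é" is U+0065 U+0301 (UTF-8 bytes
    65 CC 81), and the pre-composed form is U+00E9 (bytes C3 A9). The sketch below is only an
    illustration of that mapping, not real normalization. The three-entry table kPairs and the
    helper names (decode_utf8, encode_utf8, compose_utf8) are hypothetical; a real converter needs
    the full Unicode composition data, canonical reordering, and handling of several marks per
    base, which is what a library such as ICU provides. It also assumes well-formed,
    NUL-terminated UTF-8 input.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, tiny composition table: (base, combining mark) -> precomposed.
   Real normalization needs the full Unicode data, as shipped with ICU. */
typedef struct { uint32_t base, mark, composed; } ComposePair;

static const ComposePair kPairs[] = {
    { 0x0065, 0x0301, 0x00E9 },  /* e + COMBINING ACUTE ACCENT -> U+00E9 */
    { 0x0061, 0x0308, 0x00E4 },  /* a + COMBINING DIAERESIS    -> U+00E4 */
    { 0x006E, 0x0303, 0x00F1 },  /* n + COMBINING TILDE        -> U+00F1 */
};

/* Decode one code point; assumes well-formed UTF-8. Returns bytes consumed. */
static size_t decode_utf8(const unsigned char *s, uint32_t *cp) {
    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    if ((s[0] & 0xE0) == 0xC0) {
        *cp = ((s[0] & 0x1Fu) << 6) | (s[1] & 0x3Fu);
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0) {
        *cp = ((s[0] & 0x0Fu) << 12) | ((s[1] & 0x3Fu) << 6) | (s[2] & 0x3Fu);
        return 3;
    }
    *cp = ((s[0] & 0x07u) << 18) | ((s[1] & 0x3Fu) << 12)
        | ((s[2] & 0x3Fu) << 6) | (s[3] & 0x3Fu);
    return 4;
}

/* Encode one code point as UTF-8 into out (at least 4 bytes). Returns bytes written. */
static size_t encode_utf8(uint32_t cp, unsigned char *out) {
    if (cp < 0x80) { out[0] = (unsigned char)cp; return 1; }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Replace each (base, mark) pair found in kPairs by its precomposed code point.
   Returns the output length in bytes (excluding the NUL), or (size_t)-1 if
   the output buffer is too small. */
static size_t compose_utf8(const char *in, char *out, size_t cap) {
    const unsigned char *s = (const unsigned char *)in;
    size_t n = 0;
    while (*s) {
        uint32_t cp, next;
        s += decode_utf8(s, &cp);
        if (*s) {
            /* Look ahead one code point to see whether it combines with cp. */
            size_t used = decode_utf8(s, &next);
            for (size_t i = 0; i < sizeof kPairs / sizeof kPairs[0]; i++) {
                if (kPairs[i].base == cp && kPairs[i].mark == next) {
                    cp = kPairs[i].composed;
                    s += used;
                    break;
                }
            }
        }
        unsigned char buf[4];
        size_t len = encode_utf8(cp, buf);
        if (n + len + 1 > cap) return (size_t)-1;  /* keep room for the NUL */
        memcpy(out + n, buf, len);
        n += len;
    }
    if (cap == 0) return (size_t)-1;
    out[n] = '\0';
    return n;
}
```

    Calling compose_utf8("cafe\xCC\x81", out, sizeof out) writes the bytes of "caf" followed by
    C3 A9. Whether the result matches what MacOS produces for a given filename is a separate
    question, since its decomposition is not the standard one.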

    markus


