Implementing on UTF8: toUpper(), toFold(), normalisation, collation, etc

From: Theodore H. Smith
Date: Sat May 03 2003 - 11:50:13 EDT

    Hi list,

    I need some way to implement toUpper(), toFold(), normalisation,
    collation, and perhaps other Unicode features I may have missed,
    on UTF8 strings stored in RAM.

    I need to implement it for Windows (32-bit), MacOS9 and MacOSX.

    I already have other Unicode processing code, but nothing for these
    operations or anything close to them.

    I heard that the only way is to read the character information from
    a database. My whole string processing library, with hundreds of
    functions and a few properties, is only 54k. I don't want to add 200k
    of database reading code and then huge Unicode database files to this.

    How is this best done, then? I'm assuming there isn't any mathematical
    way to figure out a codepoint's properties? So where do I get this data
    and what's the fastest way to do it?
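
    To make the question concrete, here is a minimal sketch of one
    possible approach: a small sorted table of codepoint ranges, each
    with a fixed offset, searched by binary search. It is not a complete
    implementation. The names (CaseRange, kUpper, simple_upper,
    utf8_to_upper) are hypothetical, the table is a tiny hand-picked
    excerpt (a real one would be generated from the simple uppercase
    mappings in UnicodeData.txt), and the input is assumed to be
    well-formed, NUL-terminated UTF8. Multi-character mappings such as
    German sharp s are not handled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One run of codepoints whose uppercase form is cp + delta. */
typedef struct { uint32_t first, last; int32_t delta; } CaseRange;

/* Illustrative excerpt only, sorted by 'first'. A real table would be
   generated offline from the Unicode Character Database. */
static const CaseRange kUpper[] = {
    { 0x0061, 0x007A, -32 },  /* Basic Latin a-z                   */
    { 0x00E0, 0x00F6, -32 },  /* Latin-1 a-grave .. o-diaeresis    */
    { 0x00F8, 0x00FE, -32 },  /* Latin-1 o-stroke .. thorn         */
    { 0x03B1, 0x03C1, -32 },  /* Greek alpha .. rho                */
    { 0x03C3, 0x03CB, -32 },  /* Greek sigma .. upsilon-dialytika  */
    { 0x0430, 0x044F, -32 },  /* Cyrillic a .. ya                  */
    { 0x0450, 0x045F, -80 },  /* Cyrillic ie-grave .. dzhe         */
};

/* Binary search over the ranges; unmapped codepoints are returned as-is. */
static uint32_t simple_upper(uint32_t cp) {
    size_t lo = 0, hi = sizeof kUpper / sizeof kUpper[0];
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (cp < kUpper[mid].first)      hi = mid;
        else if (cp > kUpper[mid].last)  lo = mid + 1;
        else return (uint32_t)((int32_t)cp + kUpper[mid].delta);
    }
    return cp;
}

/* Decode one codepoint; returns bytes consumed. Assumes valid UTF8. */
static size_t utf8_decode(const unsigned char *s, uint32_t *cp) {
    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    if ((s[0] & 0xE0) == 0xC0) {
        *cp = ((uint32_t)(s[0] & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0) {
        *cp = ((uint32_t)(s[0] & 0x0F) << 12) |
              ((uint32_t)(s[1] & 0x3F) << 6) | (uint32_t)(s[2] & 0x3F);
        return 3;
    }
    *cp = ((uint32_t)(s[0] & 0x07) << 18) | ((uint32_t)(s[1] & 0x3F) << 12) |
          ((uint32_t)(s[2] & 0x3F) << 6) | (uint32_t)(s[3] & 0x3F);
    return 4;
}

/* Encode one codepoint into out (at least 4 bytes); returns bytes written. */
static size_t utf8_encode(uint32_t cp, unsigned char *out) {
    if (cp < 0x80) { out[0] = (unsigned char)cp; return 1; }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Uppercase src into dst. Returns bytes written (excluding the NUL),
   or (size_t)-1 if dst is too small. The output is written to a separate
   buffer because a mapping can change the encoded length in general. */
size_t utf8_to_upper(const char *src, char *dst, size_t dst_size) {
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;
    while (*s) {
        uint32_t cp;
        unsigned char buf[4];
        size_t len;
        s += utf8_decode(s, &cp);
        len = utf8_encode(simple_upper(cp), buf);
        if (n + len >= dst_size) return (size_t)-1;  /* keep room for NUL */
        memcpy(dst + n, buf, len);
        n += len;
    }
    if (n >= dst_size) return (size_t)-1;
    dst[n] = '\0';
    return n;
}
```

    Calling utf8_to_upper("abc", out, sizeof out) with a 64-byte out
    buffer would fill it with "ABC" and return 3. Whether the same
    range-plus-offset idea stays small enough for toFold(),
    normalisation and collation is exactly what I am unsure about.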

         Theodore H. Smith - Macintosh Consultant / Contractor.