Re: Implementing on UTF8: toUpper(), toFold(), normalisation, collation, etc

From: Ben Dougall (bend@freenet.co.uk)
Date: Sat May 03 2003 - 13:54:57 EDT

    On OS X (and maybe OS 9, not sure): see CFCharacterSet.h in the
    CoreFoundation framework, specifically CFCharacterSetPredefinedSet.
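
    A rough sketch of how that might be used, not a definitive
    implementation. The predefined sets are queried by code point, so
    UTF8 bytes have to be decoded first. utf8_decode() and
    count_uppercase() are hypothetical names made up for this example;
    only the CF* calls and kCFCharacterSetUppercaseLetter come from
    CoreFoundation. The CoreFoundation part is wrapped in
    #ifdef __APPLE__ because it only builds on OS X.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef __APPLE__
#include <CoreFoundation/CoreFoundation.h>
#endif

/* Hypothetical helper: decode one UTF-8 sequence starting at s, with at
   most len bytes available. Stores the code point in *cp and returns the
   number of bytes consumed, or 0 if the sequence is truncated, overlong,
   a surrogate, or otherwise malformed. */
size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
    size_t n, i;
    uint32_t c, min;

    if (len == 0) return 0;
    if (s[0] < 0x80) { *cp = s[0]; return 1; }

    if ((s[0] & 0xE0) == 0xC0)      { n = 2; c = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { n = 3; c = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { n = 4; c = s[0] & 0x07; min = 0x10000; }
    else return 0;                  /* stray continuation or invalid lead byte */

    if (len < n) return 0;          /* truncated sequence */
    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        c = (c << 6) | (uint32_t)(s[i] & 0x3F);
    }
    /* Reject overlong forms, values past U+10FFFF, and surrogates. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return 0;

    *cp = c;
    return n;
}

#ifdef __APPLE__
/* Hypothetical helper: count the uppercase letters in a UTF-8 buffer by
   testing each code point against the system's predefined set. */
size_t count_uppercase(const unsigned char *s, size_t len)
{
    /* Obtained with a "Get" function, so the caller does not release it. */
    CFCharacterSetRef upper =
        CFCharacterSetGetPredefined(kCFCharacterSetUppercaseLetter);
    size_t count = 0, i = 0;

    while (i < len) {
        uint32_t cp;
        size_t n = utf8_decode(s + i, len - i, &cp);
        if (n == 0) { i++; continue; }   /* skip one bad byte and resync */
        if (CFCharacterSetIsLongCharacterMember(upper, (UTF32Char)cp))
            count++;
        i += n;
    }
    return count;
}
#endif
```

    This only answers "is this code point in set X" without shipping
    your own tables. It does not do the case mapping, normalisation or
    collation itself.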

    On Saturday, May 3, 2003, at 04:50 pm, Theodore H. Smith wrote:

    > Hi list,
    >
    > I need some way to implement toUpper(), toFold(), normalisation,
    > collation, and perhaps other Unicode features I may have missed,
    > on UTF8 strings held in RAM.
    >
    > I need to implement it for Windows (32-bit), MacOS9 and MacOSX.
    >
    > I have other Unicode processing code, already, but not these or
    > anything close to these.
    >
    > I heard that the only way is to read out the character information
    > from a database? My whole string processing library, with hundreds of
    > functions and a few properties, is only 54k. I don't want to add 200k
    > of database reading code and then huge Unicode database files to this
    > 54k.
    >
    > How is this best done, then? I'm assuming there isn't any mathematical
    > way to figure out a codepoint's properties? So where do I get this
    > data and what's the fastest way to do it?
    >
    > --
    > Theodore H. Smith - Macintosh Consultant / Contractor.
    > My website: <www.elfdata.com/>
    >
    >


