Implementing on UTF8: toUpper(), toFold(), normalisation, collation, etc

From: Theodore H. Smith (delete@elfdata.com)
Date: Sat May 03 2003 - 11:50:13 EDT

Next message: Doug Ewell: "Re: Transcribing old documents into Unicode compatible document files."

Previous message: William Overington: "Transcribing old documents into Unicode compatible document files."
Next in thread: Addison Phillips [wM]: "Re: Implementing on UTF8: toUpper(), toFold(), normalisation, collation, etc"
Reply: Addison Phillips [wM]: "Re: Implementing on UTF8: toUpper(), toFold(), normalisation, collation, etc"
Reply: Carl W. Brown: "RE: Implementing on UTF8: toUpper(), toFold(), normalisation, collation, etc"
Reply: Ben Dougall: "Re: Implementing on UTF8: toUpper(), toFold(), normalisation, collation, etc"

Hi list,

I need a way to implement toUpper(), toFold(), normalisation,
collation, and perhaps other Unicode features I may have missed, on
UTF-8 strings held in RAM.

I need to implement it for Windows (32-bit), MacOS9 and MacOSX.

I already have other Unicode processing code, but nothing that does
these or anything close to them.

I've heard that the only way is to read the character information out
of a database. Is that right? My whole string processing library, with
hundreds of functions and a few properties, is only 54k. I don't want
to add 200k of database-reading code, plus huge Unicode database files,
to this 54k.

How is this best done, then? I'm assuming there isn't any mathematical
way to figure out a codepoint's properties? So where do I get this data
and what's the fastest way to do it?
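
To make the question concrete, here is a rough sketch in C of the kind
of compact lookup I have in mind: a small sorted table of code point
ranges, each with a constant offset to the uppercase form, searched by
binary search. Everything here is an assumption for illustration only.
The names CaseRange, kUpperRanges and simple_to_upper are made up, and
the table is a hand-picked, incomplete sample (basic Latin, part of
Latin-1, basic Greek and Cyrillic). Real data would have to be
generated from UnicodeData.txt in the Unicode Character Database. It
works on decoded code points, so the UTF-8 decode and re-encode step is
assumed to be handled elsewhere.

```c
#include <stddef.h>
#include <stdint.h>

/* One run of consecutive lowercase code points whose simple uppercase
   form is reached by adding the same delta. Hypothetical layout. */
typedef struct {
    uint32_t first;   /* first code point in the run (inclusive) */
    uint32_t last;    /* last code point in the run (inclusive)  */
    int32_t  delta;   /* value added to reach the uppercase form */
} CaseRange;

/* Incomplete sample data, sorted by 'first'. Not a full mapping. */
static const CaseRange kUpperRanges[] = {
    { 0x0061, 0x007A, -32 },  /* Latin small a..z                  */
    { 0x00E0, 0x00F6, -32 },  /* Latin-1 a-grave..o-diaeresis      */
    { 0x00F8, 0x00FE, -32 },  /* Latin-1 o-stroke..thorn           */
    { 0x03B1, 0x03C1, -32 },  /* Greek small alpha..rho            */
    { 0x03C3, 0x03CB, -32 },  /* Greek small sigma..upsilon-dial.  */
    { 0x0430, 0x044F, -32 },  /* Cyrillic small a..ya              */
    { 0x0450, 0x045F, -80 },  /* Cyrillic small ie-grave..dzhe     */
};

/* Return the simple (one-to-one) uppercase mapping of cp, or cp itself
   when no run in the sample table covers it. */
uint32_t simple_to_upper(uint32_t cp)
{
    size_t lo = 0;
    size_t hi = sizeof kUpperRanges / sizeof kUpperRanges[0];

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (cp < kUpperRanges[mid].first) {
            hi = mid;                 /* search the lower half */
        } else if (cp > kUpperRanges[mid].last) {
            lo = mid + 1;             /* search the upper half */
        } else {
            return (uint32_t)((int32_t)cp + kUpperRanges[mid].delta);
        }
    }
    return cp;                        /* not covered: leave unchanged */
}
```

A table like this only covers one-to-one mappings. Cases where one
character maps to several, and anything locale-dependent, would need
separate handling, and I don't know whether the same trick stretches to
normalisation or collation at all.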

--
     Theodore H. Smith - Macintosh Consultant / Contractor.
     My website: <www.elfdata.com/>

This archive was generated by hypermail 2.1.5 : Sat May 03 2003 - 12:51:29 EDT