Comparison algorithms in UNICODE

From: Patrik Faltstrom ([email protected])
Date: Sat Aug 12 1995 - 15:22:22 EDT


As part of the development of DIGGER, a distributed directory
service based on Whois++ technology, we are about to start
developing public domain software libraries, written in C,
that take care of fundamental string functions such as:

- Optimization of strings

  Some characters in the UNICODE table can also be written as
  a base character followed by one or more combining
  characters. An example is the character 00C5 LATIN CAPITAL
  LETTER A WITH RING ABOVE, which can be written as 0041+030A.
  The idea behind this function is to minimize the number of
  bytes in the UNICODE string by converting all occurrences of
  0041+030A into 00C5.

- Uppercase/Lowercase conversions

- Comparison routines

- Conversion to/from FSS-UTF and UNICODE
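
To illustrate the string optimization in the first item, here is a
minimal sketch in C. It assumes 16-bit UNICODE code units and a small
hand-written table. The names compose_pairs, lookup_pair and
pair_table are hypothetical and used only for illustration. A real
library would need a complete table and would also have to handle
combining characters that are not directly adjacent to their base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: base + combining mark -> precomposed. */
struct compose_pair {
    uint16_t base;
    uint16_t mark;
    uint16_t composed;
};

/* Illustrative subset only, not a complete composition table. */
static const struct compose_pair pair_table[] = {
    { 0x0041, 0x030A, 0x00C5 },  /* A + ring above -> 00C5 */
    { 0x0061, 0x030A, 0x00E5 },  /* a + ring above -> 00E5 */
    { 0x0041, 0x0308, 0x00C4 },  /* A + diaeresis  -> 00C4 */
    { 0x004F, 0x0308, 0x00D6 },  /* O + diaeresis  -> 00D6 */
};

/* Return the precomposed character for (base, mark), or 0 if the
 * pair is not in the table. */
static uint16_t lookup_pair(uint16_t base, uint16_t mark)
{
    size_t n = sizeof pair_table / sizeof pair_table[0];
    for (size_t i = 0; i < n; i++) {
        if (pair_table[i].base == base && pair_table[i].mark == mark)
            return pair_table[i].composed;
    }
    return 0;
}

/* Compose adjacent base + mark pairs in place.
 * Returns the new length in 16-bit units (never more than len). */
size_t compose_pairs(uint16_t *s, size_t len)
{
    size_t in = 0, out = 0;

    assert(s != NULL || len == 0);
    while (in < len) {
        uint16_t c = s[in++];
        uint16_t composed;

        /* Keep composing while the next unit combines with c. */
        while (in < len && (composed = lookup_pair(c, s[in])) != 0) {
            c = composed;
            in++;
        }
        s[out++] = c;
    }
    return out;
}
```

Given the three units 0041 030A 0042, compose_pairs returns a length
of 2 and leaves 00C5 0042 in the buffer.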

We are starting this work because I have not yet seen any public
domain libraries that do this.

If I am wrong, and such software libraries exist, please let me
know.

Also, I assume this is the right forum for discussing implementation
issues in UNICODE software? Or is there another mailing list for
that?

   Regards, Patrik Fältström
   Bunyip Information Systems Inc
   Montreal, CANADA


