A Library for Multilingual Text Processing
Naoto Takahashi - National Institute of Advanced Industrial Science and Technology

Intended Audience: Software Engineers

Session Level: Intermediate

We have developed a software library for multilingualizing application programs on Unix/Linux systems. The library, named "the m17n (= multilingualization) library", provides the functions necessary to process multilingual characters and scripts. Currently, the m17n library correctly displays 43 of the 50 scripts supported in Unicode 4.0 and provides 41 input methods for various languages and scripts.

This paper describes the design of the m17n library and the database it uses, and shows some examples. The m17n library is a natural extension of the legacy C library and the X library, so developers can learn it easily.

The m17n library itself is language independent. Language-dependent information is isolated in the m17n database, so it is easy to add support for new languages and scripts. The m17n library provides a huge character code space (0x0 .. 0x3FFFFF), in which the codes in the Unicode range (0x0 .. 0x10FFFF) are allocated to the Unicode characters with the same code values.
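
The layout of this code space can be illustrated with a small, self-contained C sketch. It does not use the m17n library: the names CODE_SPACE_MAX, UNICODE_MAX, CodeRegion, classify_code, unicode_to_code and code_space_bits are hypothetical and exist only for this illustration. The sketch assumes that the Unicode portion is the standard Unicode range 0x0 .. 0x10FFFF and that every code above it, up to 0x3FFFFF, lies outside Unicode.

```c
#include <assert.h>

/* Hypothetical names for this sketch; they are not the m17n library's API. */
#define CODE_SPACE_MAX 0x3FFFFFL  /* upper bound of the code space given in the text */
#define UNICODE_MAX    0x10FFFFL  /* highest code point defined by Unicode (assumed split point) */

typedef enum {
    CODE_INVALID,   /* outside 0x0 .. 0x3FFFFF */
    CODE_UNICODE,   /* same value as a Unicode code point */
    CODE_EXTENDED   /* inside the code space but beyond Unicode */
} CodeRegion;

/* Report which part of the code space a character code falls into. */
static CodeRegion classify_code(long code)
{
    if (code < 0 || code > CODE_SPACE_MAX)
        return CODE_INVALID;
    if (code <= UNICODE_MAX)
        return CODE_UNICODE;
    return CODE_EXTENDED;
}

/* A Unicode code point keeps its value in the larger code space,
   so the conversion is the identity on the Unicode range. */
static long unicode_to_code(long ucs)
{
    assert(ucs >= 0 && ucs <= UNICODE_MAX);
    return ucs;
}

/* Count the bits needed to represent every code in the space. */
static int code_space_bits(void)
{
    int bits = 0;
    long max = CODE_SPACE_MAX;
    while (max > 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}
```

Under these assumptions, classify_code(0x41) reports a Unicode character, classify_code(0x110000) reports a code beyond Unicode, and code_space_bits() shows that the whole space fits in 22 bits, so a character code fits comfortably in a 32-bit integer.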

The m17n library is released as open source software under the GNU Lesser General Public License and can be obtained from http://www.m17n.org/m17n-lib . Developers of internationalized or multilingualized applications can leave multilingual processing to our library and concentrate on the main purpose of their programs.

Developers who have to localize applications for minority languages can also benefit from the library.