Spelling Correctors to Improve Production and
Diffusion of Linguistic Knowledge
Chantal Enguehard (Panelist) - University of Nantes

Intended Audience: Software Engineers, African Linguists, Ethnolinguists

Session Level: Intermediate

African languages have little presence on the Internet, even though some are spoken by significant populations. The emergence of the Unicode standard makes it possible to produce and post electronic texts in these languages, but such texts are still rare because much of the population lacks the skills to write its own language: schooling is rudimentary (most people leave school after the primary education cycle), and linguistic resources are almost non-existent.

Spelling correctors adapted to the African context can help improve this situation by fulfilling a dual role: checking texts (as ordinary spelling correctors do) and disseminating linguistic information. Such correctors can be produced by gathering and compiling the linguistic resources that already exist in institutions and by using robust algorithms.
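The dual role described above can be illustrated with a minimal sketch. It assumes a lexicon compiled from existing resources in which each word carries a short linguistic note, and it uses Levenshtein edit distance as a stand-in for the "robust algorithms" mentioned, which the abstract does not specify. The lexicon entries, the names `LEXICON`, `normalize`, `edit_distance` and `check`, and the glosses are all hypothetical and purely illustrative; they are not taken from the project's resources.

```python
import unicodedata

# Toy lexicon: word -> short linguistic note.
# ASSUMPTION: illustrative placeholder entries with approximate glosses,
# not data from the resources described in the abstract.
LEXICON = {
    "muso": "noun: woman",
    "mɔgɔ": "noun: person",
    "ji": "noun: water",
    "kalan": "noun/verb: study, reading",
}


def normalize(word):
    """Put a word in Unicode NFC form and lower case before lookup."""
    return unicodedata.normalize("NFC", word).lower()


def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def check(word, lexicon=LEXICON, max_distance=1):
    """Check one word and return suggestions together with their notes.

    A known word is returned with its own note, so the checker also
    serves to pass on linguistic information, not only to flag errors.
    """
    w = normalize(word)
    if w in lexicon:
        return {"word": w, "known": True, "suggestions": [(w, lexicon[w])]}
    scored = sorted((edit_distance(w, entry), entry) for entry in lexicon)
    suggestions = [(entry, lexicon[entry])
                   for dist, entry in scored if dist <= max_distance]
    return {"word": w, "known": False, "suggestions": suggestions}
```

With this toy lexicon, `check("musso")` reports the word as unknown and suggests `muso` along with its note. A real corrector would need a far larger lexicon and would probably have to handle morphology and tone marking, which this sketch ignores.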

The existence of such spelling correctors would help bring these languages into the most modern technologies and would be a strong encouragement to produce, read and exchange texts. The production of linguistic resources nevertheless remains a problem in countries with limited means. Exploiting existing corpora and examining them with suitably adapted systems can considerably help the work of linguists.

This process is currently being tested on Bambara (one of the languages of Mali) in collaboration with linguists of CNR-ENF (National Research Center for Non-Formal Education).