Unicode and Arabic Studies
Thomas Milo - DecoType

Intended Audience: Software Engineers, Systems Analysts, Technical Writers, Web Administrators, Web Designers

Session Level: Beginner, Intermediate, Advanced

Until recently, Arabic text representation was the exclusive domain of professional calligraphers and typographers. Today it revolves around elusive computer codes and ugly fonts. Yet scholars are expected to be able to handle literary text, archaic text and contemporary Qur'anic text with so-called word processors. The industry attempts to cater for such requirements, but it must do so practically without the participation or professional input of academic specialists. Consequently, the potential of philological computing, in fields like database research, networking and publishing, remains largely untapped.

For the creation of a complete model for handling Arabic script with information technology, an exhaustive understanding of its structure is imperative. Creating such a model involves linguistically sound computer-aided transcription for efficient data entry on the one hand and historically correct script images as professional output on the other. This is the kind of exercise where one cannot afford to take anything for granted regarding Arabic text representation.

This approach forces one to explore the opportunities of Unicode-based information technology for Arabic philology. While addressing key issues of Arabic computing, this paper takes the requirements of Qur'anic studies as the central theme: computer-aided transcription to input a clean data structure related to graphemes and archigraphemes as well as correctly shaped typography that incorporates precise rules for allographic assimilation.
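The distinction between graphemes and archigraphemes can be illustrated with a minimal Python sketch. It is not the method described in the paper. It assumes a simplified, position-independent mapping in which letters that differ only by their dots share one dotless skeleton, whereas in actual Arabic script some letters merge only in certain positions within a word. The names ARCHIGRAPHEME and to_skeleton are hypothetical and introduced only for illustration.

```python
import unicodedata

# Hypothetical, simplified table: each dotted letter maps to a dotless
# Unicode character standing in for its archigrapheme. Assumption: the
# mapping ignores the letter's position in the word.
ARCHIGRAPHEME = {
    "\u0628": "\u066E",  # BEH   -> DOTLESS BEH
    "\u062A": "\u066E",  # TEH   -> DOTLESS BEH
    "\u062B": "\u066E",  # THEH  -> DOTLESS BEH
    "\u062C": "\u062D",  # JEEM  -> HAH
    "\u062E": "\u062D",  # KHAH  -> HAH
    "\u0630": "\u062F",  # THAL  -> DAL
    "\u0632": "\u0631",  # ZAIN  -> REH
    "\u0634": "\u0633",  # SHEEN -> SEEN
    "\u0636": "\u0635",  # DAD   -> SAD
    "\u0638": "\u0637",  # ZAH   -> TAH
    "\u063A": "\u0639",  # GHAIN -> AIN
    "\u0641": "\u06A1",  # FEH   -> DOTLESS FEH
    "\u0642": "\u066F",  # QAF   -> DOTLESS QAF
    "\u0646": "\u06BA",  # NOON  -> NOON GHUNNA (dotless noon shape)
    "\u064A": "\u0649",  # YEH   -> ALEF MAKSURA (dotless yeh shape)
}


def to_skeleton(text):
    """Reduce Arabic text to a dotless, unvocalized letter skeleton."""
    # NFD separates combining marks (e.g. hamza above) from base letters.
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    for ch in decomposed:
        # Drop nonspacing marks: vowel signs, shadda, sukun, hamza above.
        if unicodedata.category(ch) == "Mn":
            continue
        out.append(ARCHIGRAPHEME.get(ch, ch))
    return "".join(out)
```

Under these assumptions, two words spelled with different dotted letters but the same underlying shapes reduce to an identical skeleton, which is the kind of relation between spelling and script image that a grapheme-level encoding alone does not expose.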

The paper is based on the results of research into two faces of Arabic text: computer-aided Latin transcription and computer-synthesized Arabic script.

The technology under scrutiny creates the conditions for contrastive analysis of digital Arabic text and computer-synthesized calligraphy. This reveals unexpected relations between calligraphy, spelling and possibly even text history.