Scripts from the Past in Future Versions of Unicode®

Carl-Martin Bunz - Institute of Comparative Linguistics, Johann Wolfgang Goethe-University

Intended Audience: Software Engineer, Marketer, Standardizer
Session Level: Beginner, Intermediate, Advanced

At the beginning of the new millennium, Unicode presents itself in Version 3.0, which not only encompasses the major scripts of the world but also shows the first steps towards a more comprehensive encoding. Efforts are being made to include historic scripts in Unicode. In recent years several encoding proposals have been put forward that prepare historic scripts for encoding according to the Unicode principles, but without taking into account the results of the linguistic and philological research done over the past century. In most cases, the proposals are based on manuals that no longer reflect current knowledge. On the other hand, one has to remember that Unicode also wishes to address lay users who play with historic scripts; in the domain of funware, this gives commercial relevance to scripts not normally involved in real business or administrative communication.

Up to this day, the academic world has not participated very actively in these encoding efforts. Often this is due not to a lack of interest or to scholarly inertia, but to reasons of principle and method. Certain historic scripts are unencodable for structural reasons, so that a standardized character encoding would be pointless for scientific work. Other scripts still have to be investigated before any judgement can be made about their encodability.

The present paper sketches priorities for the future encoding of historic scripts in Unicode, based upon a ranking with respect to encodability. The strategies proposed should show that the interests of both the lay and the academic user can be satisfied.

