Encoding Scripts from the Past: Conceptual and Practical Problems and Solutions

Carl-Martin Bunz - University of Saarland

Intended Audience: Manager, Software Engineer, Marketer, Standardizer
Session Level: Beginner, Intermediate, Advanced

My paper "Scripts from the Past in Future Versions of Unicode(R)", delivered at IUC 16 (slot B5), was received with enough interest that a follow-up tutorial seems useful, one that deals more explicitly with the problems that arise when historic scripts are prepared for standardized encoding. While my previous talk focussed on classifying the scripts according to both user interest and encodability, this tutorial (4 hours) covers, first, the systematic differences between the encoding processes for current and for historic scripts. Introducing palaeography and a palaeographic database makes this fundamental split salient. Second, repertoires of historic writing symbols necessitate a new look at the notion of script that Unicode adheres to. I intend to point out the main difficulties that different script concepts involve, depending on their level of abstraction. From the point of view of ISO and Unicode as well as of the scholarly community, a compromise between scientific treatment, engineering and marketing seems to be the most viable way to reach practicable solutions. Third, the Unicode-compliant definitions of Character and Glyph must be checked against the situation of historic script data. All this will be illustrated by numerous examples from various writing traditions.

In a second part, the results of part 1 will be analysed with a view to ranking the scripts by encodability. A roadmap for the inclusion of historic scripts in Unicode, however, cannot be designed with regard to encodability alone; it must also take into account user interests, which will be reviewed at this point. In conclusion, the tutorial will synthesize the two rankings in order to elaborate a sound strategy for approaching the encoding of the historic scripts of the world.

