L2/04-059
From: Karljürgen G. Feuerherm
Subject: Short Response to L2/04-041 "Fitting Cuneiform Encoding to Cuneiform Script"
Date: 30 January, 2004

Dear members of the UTC,

The present write-up constitutes a short response to the recent submission by Lloyd Anderson, a document purporting to explain why the cuneiform encoding proposal N2698, submitted by Everson, Feuerherm, and Tinney on behalf of the assyriological and broader scholarly communities and presently under consideration by the UTC, reflects a flawed model whose acceptance would involve a serious disservice to the Assyriological community.

While Anderson's tremendous investment in and contribution to the cuneiform encoding are both to be appreciated and respected, and his intentions are of the very best, L2/04-041 contains a number of inaccuracies directly affecting the validity of the argument, from which faulty conclusions are drawn.

It may be relevant to draw attention to the fact that I am myself a practicing assyriologist, a specialist in the Old Babylonian dialect and script, as well as a former computer scientist, and therefore speak from an informed, rather than a lay, viewpoint.

Anderson's document is a complex one, both in terms of topical coverage and in terms of style. I do not propose to respond to everything contained therein, but to restrict myself to the fundamental premise and its development. Other issues, such as the comparison with Han and Latin, have already been refuted on the public list (cuneiform@unicode.org) and need not be rehashed here; further, any ambiguity on this score will, I expect, be addressed directly by Ken Whistler. Other points bearing on the derivation from the earliest history of cuneiform, I leave to Steve Tinney, a prominent assyriologist and specialist in early cuneiform, to address.

The fundamental argument of Anderson's paper seems to be as follows:

1. Certain principles were agreed upon by "the encoding group" at an early stage, on the lines of following the traditional scribal and assyriological understanding of what are cuneiform signs in the context of their diachronic development.
2. At some later stage, the "encoding group" departed from these principles and went ahead with an encoding which did not conform to the agreed upon plan.
3. That the resulting encoding made "a majority of active participants unhappy," and that a "band-aid" solution was applied.
4. That the encoding presently under consideration (N2698), if accepted, would seriously damage the field of cuneiform research.

These issues will now be examined point by point.

1. Certain principles were agreed upon by "the encoding group" at an early stage, on the lines of following the traditional scribal and assyriological understanding of what are cuneiform signs in the context of their diachronic development.

Anderson begins by stating that "[t]he group which has been preparing a proposal for Cuneiform encoding when [sic; went?] through several stages." It is not clear who is being envisaged here, whether one or the other of the Initiative for Cuneiform Encoding (hereinafter ICE) working groups, the general readership of the cuneiform@unicode.org list, Everson, Feuerherm, and Tinney, or someone else. Either way, however, the premise of his introductory paragraph, that "[d]ecisions included the encoding of signs in the sense traditionally understood in the field," is false and misleading.

During the first ICE conference (ICE 1), one of the ideas discussed was the simple and straightforward notion that one simply encode all the signs [as they are traditionally understood] and be done with it, but this was no more than one opinion expressed among many.
This approach did not find general consensus and no commitment was made by anyone to necessarily abide by tradition, whether this means the understanding of original scribes or the analysis of cuneiform scholars over the last century and a half, which, for the most part, follows that understanding. Instead, participants realized that they were presented with the opportunity to engage in an analysis and to design an encoding which would be more flexible and valuable to further research than a restrictive conservative model.

The basic principles which were advanced, reached consensus (including by Anderson!), and were adopted, were essentially as follows:

a. To encode signs which presented a single visual unit as characters in their own right;
b. To encode signs which, at some point in their evolution, presented multiple visual units in linear sequence, as strings of characters corresponding to their constituent units, rather than as characters in their own right.

It is important to stress that both principles achieved complete consensus on the part of ICE 1 participants, who included, among others, Steve Tinney; Jerrold Cooper, an assyriologist and specialist in Sumerian; myself; Alasdair Livingstone, an assyriologist and prominent cuneiform researcher from Birmingham; and Simo Parpola, an assyriologist and foremost expert on Neo-Assyrian dialect and script. In other words, the major pertinent periods and areas of cuneiform script were represented by appropriate experts taking part in the decision making process.

2. At some later stage, the "encoding group" departed from these principles and went ahead with an encoding which did not conform to the agreed upon plan.

This point has essentially been answered under the discussion of the former. The history which Anderson suggests here relies entirely upon a false dichotomy.
The "encoding of signs in the sense traditionally understood" was never accepted as an encoding principle, so that one can hardly argue that the "encoding group" later departed from it. Consensus on the model to be followed existed from the beginning. Furthermore, these two principles, among others, were reiterated by Everson and Feuerherm in N2585 in preparation for the second ICE conference (ICE 2), and both were endorsed and reaffirmed by that conference; consensus did not deviate in this regard.

An additional principle proposed by N2585, which was accepted by participants of ICE 2 and which is of some relevance in responding to Anderson, is that, in light of the rudimentary state of palaeographical research into the earliest stages of cuneiform development, material from periods prior to the so-called Ur III (or Neo-Sumerian) period would be considered only insofar as it was clear and straightforward and already known to participants (i.e. not requiring substantiating research). Thus, Archaic Cuneiform was explicitly excluded for the time being.

Since these decisions were made, other leading assyriologists including Robert Englund (*the* specialist in the Archaic period) and Gerfrid Müller (a specialist in Hittite), and countless others, too numerous to mention here, have been aware of and involved in the encoding discussions, and not a one of them has ever found fault in the principles outlined above.

3. That the resulting encoding made "a majority of active participants unhappy," and that a "band-aid" solution was applied.

The application of principle (b) above did lead to a fork in the path, alluded to by Anderson, in that a number of compound signs which appear as linear strings of visual units were not necessarily composed of strings of entities all of which could be represented as otherwise identified signs (i.e. tokens given character status under principle (a)).
In the interests of preparing an interim proposal, Everson, Feuerherm, and Tinney decided to provisionally grant character status to these items in the interest of following the agreed upon principles faithfully. There was a certain concern on the part of more conservative assyriologists that this might lead later users of the encoding to confuse such items with legitimate signs properly speaking. This was mainly a psychological concern, and not one of technical significance or consequence. In the interests of maintaining consensus, therefore, these "artificial" characters were removed from the encoding, and signs of this type were reassigned to the (a) category. Whether or not to encode such sign "components" in a separate, clearly identified block remains under consideration for a later date, however.

It is important to state here, contra Anderson's general argumentation, that such "components" do in fact appear in scholarly literature, notably in describing the organization of traditional sign lists, and that their encoding would therefore be an asset to the community -- an issue which Anderson does not appear to have considered at all. This demonstrates that it is the conservative approach which is generating the additional complexity in encoding, and therefore doing the disservice, rather than the other way around.

4. That the encoding presently under consideration (N2698), if accepted, would "be very damaging to the field of cuneiform studies."

Certain of Anderson's arguments reveal a lack of familiarity with the field of assyriological research, suggesting that it would be unsafe to accept his analysis. One example may suffice here.
In the second paragraph of his paper, Anderson discusses entities of the form "SIGN.SIGN" (a notation which can reflect two underlying realities), and implies that "the encoding group" was not able to distinguish between the underlying realities and (presumably irresponsibly) "decided to go ahead without yet attempting to consider all of the consequences."

It is true that such notation can reflect either (a) a sequence of legitimate signs or (b) a description of an unclear passage, in terms of signs which are known. The second usage corresponds to what Anderson labels as "a glyph description language." It is not true, however, that "an easy distinction of the two" is not possible. Competent assyriologists are well able to differentiate between them. It is (perhaps) only certain lay people who cannot -- as would be true for any other field.

The charge that the "encoding group" threw "glyph description items" in with legitimate sign encodings is preposterous to say the least. The items encoded under the current proposal are those whose existence is clear and unambiguous. Items which require further study (and this really relates to the pre-Ur III period) have been left aside, as per consensus agreement.

It seems that Anderson's whole argument rests upon the fact that the proposed encoding deviates from "tradition" in that it recognises as significant units in the encoding, in some cases, tokens beneath the level of signs as generally understood -- and that this is, somehow, an invalid or dangerous approach. The argument begs the question. It is quite simply pointless to spend pages and pages demonstrating that we have departed from traditional understanding. As outlined above, this decision was made openly, in consultation with leading assyriologists and Unicode experts (most notably M. Everson, R. McGowan, and K. Whistler). The pros and cons were weighed in public meetings and discussed extensively over the last four years or so.
Nothing has been done in the closet, and nothing without serious consideration of the consequences.

In our view, the approach taken here is preferable to that suggested by Anderson, in that it offers greater flexibility (in some cases allowing for addressing at a lower level), and those who wish to shut their eyes to the sub-division of signs are perfectly able to do so.

5. Final Points and Summary

Having addressed the fundamental technical concern, I wish to draw attention to a concern of another nature. It strikes me that the Anderson paper is idiosyncratic, and at some level a simple polemic against those who have chosen not to follow his views. Thus, for example, the page labelled "Cuneiform Signs" and subtitled "Two approaches to encoding Cuneiform signs are here contrasted," purports to contrast the encoding under consideration with Anderson's. The former, labelled "A," is, however, described as "Ken Whistler's." I cannot, for one moment, conceive of the rationale underlying this description. That it accurately reflects Anderson's belief is adequately documented by his recent posts on the public list in response to Ken Whistler's arguments supporting our proposal.

I wish to emphatically point out that "Ken Whistler" is not the author of the proposed encoding, did not craft it, dictate it, or otherwise sell it to Everson, Feuerherm, and Tinney, nor to the vast majority of participating cuneiform scholars who support our proposal. Ken has been actively involved as a consultant who offered answers when *we* posed questions. The model presented in N2698 is based on the consensus decisions arrived at at ICE 1, reiterated by Everson and Feuerherm in N2585, ratified at ICE 2, and supported by (as far as I know) all participating and interested assyriologists, and a majority of lay persons.
It is certainly possible, and presumably likely, that some people somewhere (besides Anderson) will be at least partly dissatisfied with the present proposal: 100% consensus among all professionals, most of whom would not understand the technical issues in any case (no slight intended), is an unrealistic goal. As far as we can see, however, the expressed dissatisfaction to date has been minimal, so little in fact that only thirteen modifications to the original proposal were requested and required action, in comparison with the 952 characters proposed. Whether non-professionals expressed dissatisfaction may be more difficult to assess (from my vantage point), but certainly there has been no great outcry of the type Anderson is suggesting. In fact, the main tone of commentary on cuneiform@unicode.org of late has not been one of dissatisfaction with the work of Everson, Feuerherm, and Tinney, but rather with the revisionist views of Anderson. The majority of public opinion is staunchly in favour of that proposal.

It would be, I think, an abuse of the time of the UTC to go into further details here; specific questions can be addressed and, I am confident, satisfactorily answered by Steve Tinney, Ken Whistler, and others who will be at hand.
The main issues of import are not the many confused details adduced by Anderson, but the following:

* N2698 follows the base principles which achieved consensus and have been repeatedly ratified;
* it corrects the very few concerns expressed by professionals following the tabling of N2664;
* it has the support of all actively involved assyriologists, and many if not all non-actively involved but interested assyriologists;
* it enjoys the support of the majority of interested lay-persons;
* it has been constructed (as to write-up) by two assyriological professionals who not only understand the field but have reasonable competence in computer science, with continuous consultation with other recognized professionals from the field; and
* it has benefited from active involvement of highly competent professionals representing Unicode, namely Ken Whistler, Michael Everson, and Rick McGowan.

It is true that the current proposal does not slavishly follow the established conventions of the ancient scribes or the assyriological tradition of the last century. But, then again, the present encoders are not ancient scribes or assyriologists of the last century, but rather competent scholars designing a vision for the future. To remain with a 5,000 year old viewpoint on principle is to sacrifice flexibility and vision for the future.

The encoding which is here proposed will allow encoding of all known signs; it allows in many cases for the immediate encoding of other potential, not yet known signs through its flexibility, without encumbering the architecture with the complexities (and implementation expense) of a completely dynamic model; and it permits users of the encoding to address, at the character level, components of certain compound signs of linear structure without difficulty. In short, it can do all that a conservative model such as that proposed by Anderson would be able to do, and more, and with less rigidity.
I sincerely hope that the UTC will consider seriously the consensus testimony of the assyriological community, which, it is hoped, is quite capable of deciding for itself whether or not its needs are being met.

Sincerely,

Karljürgen G. Feuerherm
Assyriologist
Wilfrid Laurier University
75 University Avenue West
Waterloo, Ontario
Canada