L2/05-289

Date: October 6, 2005
Title: Cuneiform property inconsistencies
Source: Ken Whistler
Action: For review and decision by the UTC

In reviewing Cuneiform character properties for Unicode 5.0, I turned up an inconsistency in the way a few Old Persian Cuneiform characters were treated as of Unicode 4.1. I think we should reconsider those in the context of similar property assignments made (earlier) for Ugaritic and (currently) for Sumero-Akkadian, to make them maximally consistent across the 3 Cuneiform scripts.

In particular, L2/04-354 summarized the proposed character properties for Sumero-Akkadian. One of its conclusions was that the Bidi_Class for all the Sumero-Akkadian signs should simply be "L" consistently, including the 4 punctuation marks. And the punctuation marks themselves were given gc=Po.

This is consistent with the way Ugaritic has been treated. In particular:

1039F;UGARITIC WORD DIVIDER;Po;0;L;;;;;N;;;;;

However, there are two anomalies in the Old Persian data, as currently defined.

1. The Old Persian word divider is unaccountably gc=So, instead of gc=Po

103D0;OLD PERSIAN WORD DIVIDER;So;0;L;;;;;N;;;;;

2. The Old Persian numerals are unaccountably bc=ON, instead of bc=L

103D1;OLD PERSIAN NUMBER ONE;Nl;0;ON;;;;1;N;;;;;
103D2;OLD PERSIAN NUMBER TWO;Nl;0;ON;;;;2;N;;;;;
103D3;OLD PERSIAN NUMBER TEN;Nl;0;ON;;;;10;N;;;;;
103D4;OLD PERSIAN NUMBER TWENTY;Nl;0;ON;;;;20;N;;;;;
103D5;OLD PERSIAN NUMBER HUNDRED;Nl;0;ON;;;;100;N;;;;;

To maximize the consistency between the way Ugaritic and Sumero-Akkadian are treated on the one hand, and Old Persian on the other, I propose that the 6 Old Persian characters in question have their properties corrected in Unicode 5.0 to:

103D0;OLD PERSIAN WORD DIVIDER;Po;0;L;;;;;N;;;;;
103D1;OLD PERSIAN NUMBER ONE;Nl;0;L;;;;1;N;;;;;
103D2;OLD PERSIAN NUMBER TWO;Nl;0;L;;;;2;N;;;;;
103D3;OLD PERSIAN NUMBER TEN;Nl;0;L;;;;10;N;;;;;
103D4;OLD PERSIAN NUMBER TWENTY;Nl;0;L;;;;20;N;;;;;
103D5;OLD PERSIAN NUMBER HUNDRED;Nl;0;L;;;;100;N;;;;;

The inconsistency in treatment also extends to the Line_Break property, which for Old Persian was consonant with the gc=So assignment, but which should be changed to fit with the other Cuneiform word divider characters. To wit:

1039F;BA # UGARITIC WORD DIVIDER
103D0;AL # OLD PERSIAN WORD DIVIDER
12470;BA # CUNEIFORM PUNCTUATION SIGN OLD ASSYRIAN WORD DIVIDER

I propose that for Unicode 5.0, the Old Persian word divider be updated to lb=BA, again for best consistency.

I think the problems for Old Persian simply resulted from oversights during the beta period review of the UCD for Unicode 4.1.
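
For reviewers who want to check the records mechanically, the following is a minimal sketch and not part of the proposal itself. It assumes the standard semicolon-separated UnicodeData.txt layout (field 2 is General_Category, field 4 is Bidi_Class) and restates the consistency rule argued for above as two checks: word dividers should be gc=Po, and every listed character should be bc=L. The helper names (parse, anomalies, apply_proposal) and the LINE_BREAK table are hypothetical, introduced only for illustration. The input is limited to the Unicode 4.1 records quoted in this document.

```python
# Field positions in a semicolon-separated UnicodeData.txt record.
NAME, GC, BIDI = 1, 2, 4

# Unicode 4.1 records exactly as quoted in this document.
RECORDS_4_1 = """\
1039F;UGARITIC WORD DIVIDER;Po;0;L;;;;;N;;;;;
103D0;OLD PERSIAN WORD DIVIDER;So;0;L;;;;;N;;;;;
103D1;OLD PERSIAN NUMBER ONE;Nl;0;ON;;;;1;N;;;;;
103D2;OLD PERSIAN NUMBER TWO;Nl;0;ON;;;;2;N;;;;;
103D3;OLD PERSIAN NUMBER TEN;Nl;0;ON;;;;10;N;;;;;
103D4;OLD PERSIAN NUMBER TWENTY;Nl;0;ON;;;;20;N;;;;;
103D5;OLD PERSIAN NUMBER HUNDRED;Nl;0;ON;;;;100;N;;;;;
"""

# Line_Break values for the three word dividers, as quoted above.
LINE_BREAK = {0x1039F: "BA", 0x103D0: "AL", 0x12470: "BA"}


def parse(text):
    """Return {code point: list of fields} for UnicodeData-style lines."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if line:
            fields = line.split(";")
            table[int(fields[0], 16)] = fields
    return table


def anomalies(table):
    """List (code point, property, current value, expected value) tuples.

    Assumed rule, paraphrasing the document: word dividers are gc=Po,
    and every character in the table is bc=L.
    """
    found = []
    for cp, fields in sorted(table.items()):
        if fields[NAME].endswith("WORD DIVIDER") and fields[GC] != "Po":
            found.append((cp, "gc", fields[GC], "Po"))
        if fields[BIDI] != "L":
            found.append((cp, "bc", fields[BIDI], "L"))
    return found


def apply_proposal(table):
    """Return a copy of the table with every reported anomaly corrected."""
    fixed = {cp: list(fields) for cp, fields in table.items()}
    for cp, prop, _old, new in anomalies(table):
        fixed[cp][GC if prop == "gc" else BIDI] = new
    return fixed


before = parse(RECORDS_4_1)
after = apply_proposal(before)

for cp, prop, old, new in anomalies(before):
    print(f"U+{cp:04X} {prop}: {old} -> {new}")

# Word dividers whose Line_Break value differs from BA.
lb_outliers = sorted(cp for cp, lb in LINE_BREAK.items() if lb != "BA")
print("lb outliers:", [f"U+{cp:04X}" for cp in lb_outliers])
```

Rejoining each corrected field list with ";".join(...) yields records in the same shape as the proposed Unicode 5.0 lines above, so the result can be compared against them directly.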