L2/05-270

Title: WG2 Consent Docket (Sophia Antipolis)
Date: September 21, 2005
Author: Ken Whistler

To assist the UTC in finalizing all the approvals for the Unicode 5.0 repertoire (and future versions), I have, as usual, collected together in this document all of the currently unresolved discrepancies between approvals of record by the UTC for character additions and the latest status of WG2 decisions regarding character additions for ISO/IEC 10646.

The relevant documents from the most recent WG2 meeting, in Sophia Antipolis, France are:

  WG2 N 2954 Resolutions of WG 2 meeting 47 (= L2/05-271)
  WG2 N 2991 Summary of repertoire for FDAM 2 of ISO/IEC 10646 (= L2/05-272)
  WG2 N 2993 Summary of repertoire of PDAM 3 of ISO/IEC 10646 (= L2/05-273)

==============================================================

Part I: Discrepancies relevant to Amd 2 and Unicode 5.0

This part is concerned with discrepancies related to Amd 2, which will be issued soon for an FDAM ballot and which should define the repertoire for Unicode 5.0. At this point, no further technical change to Amd 2 is feasible, and the UTC should simply approve all of these changes and/or additions, to avoid any chance of a desynchronization of the standards.

A. Uralicist character additions

At the WG2 meeting, Finland brought in a request for the addition of 6 Uralicist characters for Amd 2. The UTC saw these at the August meeting (L2/05-189, = WG2 N 2958), but took no action, as everyone misinterpreted the "progress report" as simply an FYI. At WG2 it became apparent that the report was actually an urgent request for the addition of 6 characters. That request was restated, much more clearly, in WG2 N 2989 (= L2/05-261). That document represented the consensus of an ad hoc group at the meeting, which concluded that all 6 characters were justified.

The 6 characters were accepted into FDAM 2. The UTC needs to formally approve them as well.
  1DFE COMBINING LEFT ARROWHEAD ABOVE
  1DFF COMBINING RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW
  27CA VERTICAL BAR WITH HORIZONTAL STROKE
  2C77 LATIN SMALL LETTER TAILLESS PHI
  A720 MODIFIER LETTER STRESS AND HIGH TONE
  A721 MODIFIER LETTER STRESS AND LOW TONE

B. Horizontal mathematical bracket

The Irish NB ballot comments asked for a bottom version of the horizontal tortoise shell bracket, to match the already approved U+23E0 TOP TORTOISE SHELL BRACKET. The UTC had considered this as a symmetry issue, but hasn't approved the addition. The original proposal (L2/04-329) hadn't asked for it and noted: "MathType also supports the top tortoise shell bracket, but no bottom bracket."

Asmus has indicated that in consultation with the math experts, it is clear that these brackets would be used consistently, with top forms for annotations above numerators in stacked expressions and bottom forms below denominators -- so there really is no reason not to have them in symmetric pairs.

In any case, WG2 approved the bottom tortoise shell bracket, and the UTC needs to formally approve it, as well as the resulting code point change for another character.

  23E1 BOTTOM TORTOISE SHELL BRACKET

And to keep the horizontal brackets in a contiguous range, the following code point change:

  23E1 --> 23E7 ELECTRICAL INTERSECTION

C. Code points for already approved Latin characters

There was a screwup in code point allocation at the May UTC meeting for the addition of two Latin characters.

At the February UTC meeting, 6 Latin characters for Uighur were accepted in the range U+2C65..U+2C6A. (102-C10, L2/05-029)

At the May UTC meeting, 2 lowercase Latin characters with stroke (for Sencoten) were accepted in the range U+2C65..U+2C66.

On May 25, while updating the pipeline, I notified Lisa and Rick of the code point conflict and suggested correcting it by moving the two May characters to the open code points U+2C6B..U+2C6C. The pipeline has reflected that state of affairs since that date.
However, corrected minutes for the May UTC meeting have been in limbo, and when Asmus prepared preliminary (FDAM 2) ballot documents for the WG2 meeting, he resolved the code point conflict differently, by leaving the two characters from the May meeting alone, and instead moving the 6 Uighur characters down by two code points.

In the interest of minimizing the possibility for further errors in document preparation, WG2 chose to accept all 8 characters at the code points shown in the preliminary ballot documents. The UTC needs now to formally accept these modified code points, so that everything is back in synch.

For absolute clarity, the entire list of 8 is given here. What is in question is not the names or identities of the 8 characters, but merely an affirmation of the code points assigned by WG2, which will be in the FDAM 2.

  2C65 LATIN SMALL LETTER A WITH STROKE
  2C66 LATIN SMALL LETTER T WITH DIAGONAL STROKE
  2C67 LATIN CAPITAL LETTER H WITH DESCENDER
  2C68 LATIN SMALL LETTER H WITH DESCENDER
  2C69 LATIN CAPITAL LETTER K WITH DESCENDER
  2C6A LATIN SMALL LETTER K WITH DESCENDER
  2C6B LATIN CAPITAL LETTER Z WITH DESCENDER
  2C6C LATIN SMALL LETTER Z WITH DESCENDER

D. Cuneiform

The U.S., Ireland, and Canada all asked for a coordinated set of minor updates (name and glyph changes, some removals) to a small number of Sumero-Akkadian cuneiform characters in the ballot, based on input from the expert, Steve Tinney. Michael Everson fielded a few more last-minute fixes for glyphs and names from Tinney, and those were also incorporated in the revised chart and names list seen in WG2 N 2991 prepared for the FDAM 2 ballot.

The UTC should give another "just in case" approval for the revised chart and character names as shown in WG2 N 2991, to pick up these last revisions.

E. N'Ko Name Changes

WG2 agreed to 4 N'Ko character name changes requested by Ireland. The UTC needs to accept these name changes. The list is:

  07E8 NKO LETTER JONA JA (< ... OLD JA)
  07E9 NKO LETTER JONA CHA (< ... OLD CHA)
  07EA NKO LETTER JONA RA (< ... OLD RA)
  07F6 NKO SYMBOL OO DENNEN (< ... OO DEENE)

F. Phags-pa Glyphs

Andrew West requested a number of minor glyph changes for Phags-pa in documents WG2 N 2972 (= L2/05-255) and WG2 N 2979 (= L2/05-257). The UTC took no position on this. In consultation with China and Ireland at the WG2 meeting, an acceptable set of glyph changes was agreed upon. The characters impacted are:

  A843..A845, A852, A856..A857, A859, A863..A864, A867..A868, A870..A871

All of these changes are rather minor, and were acceptable to China. With the exception of the change to A857, which was a matter of choice between two alternative glyphs, the changes amount to replacing a short continuation bar at the lower right of a glyph with a small serif (or nothing at all).

The UTC should simply agree to the glyphs as now shown in WG2 N 2991.

Note: WG2 agreed to fixes for all of the glyph errata that the UTC had posted, so with agreement on these Phags-pa glyph changes, the UTC and WG2 should once again be in synch in agreeing about the glyphs to use in the code charts.

===========================================================

Part II: Discrepancies relevant to Amd 3 and Unicode 5.x

Unlike the discrepancies related to the FDAM for Amd 2, the discrepancies listed in this part deal with WG2 approvals for a PDAM ballot for Amd 3. In these cases the UTC has greater flexibility, as there are two rounds of national body balloting to go, and there is leeway to disagree and change things. The choices now are either to

  1. simply approve what WG2 has done and move on,
  2. disapprove (in part or in whole) now and take the relevant actions to start preparing positions for ballot comments, or
  3. take no action now, and wait until actually confronted with the discrepancy later when reviewing the PDAM.

In most cases I don't advise option 3, unless there is significant potential for ratholing in discussion of some topic now, as it raises the risk that we will overlook a discrepancy in the future in responding to a ballot. (And it is more work for me, Michel, and Asmus to track things.)

G. Inverted Interrobang

WG2 approved:

  2E18 INVERTED INTERROBANG

on the basis of WG2 N 2935 (= L2/05-086). The UTC has discussed the character in extenso, but not yet approved it.

H. Musical symbol

WG2 approved:

  1D129 MUSICAL SYMBOL MULTIPLE MEASURE REST

on the basis of WG2 N 2983 (= L2/05-258). An earlier version of that document was discussed on the unicore list, and it is my opinion that the resolution in L2/05-258 is probably the best way to go on the issues discussed there.

I. Lepcha /ng/

WG2 approved the entire proposal for Lepcha, based on WG2 N 2947 (= L2/05-158). That includes one character not yet approved by the UTC:

  1C35 LEPCHA CONSONANT SIGN KANG

The UTC needs to review the evidence and decide what to do.

J. Ol Chiki

WG2 approved the encoding of the Ol Chiki script at a new block 1C50..1C7F, based on document WG2 N 2984 (= L2/05-243). The encoding is not problematical -- in fact it is elaborately and profusely praised by at least some of the Santali-speaking community who want to use it. (But recall, there was some fuss a while ago by a counter-community that disapproves of the use of the old script at all.) I would recommend that the UTC simply approve the encoding at this point.

K. Saurashtra

WG2 approved the encoding of the Saurashtra script at a new block A880..A8DF, based on document WG2 N 2969 (= L2/05-222).
There has been a lingering disagreement with Peri Bhaskararao regarding one character in the proposal (U+A8B4 SAURASHTRA LETTER UPAKSHARA), but at this point my recommendation would be to approve the entire proposed encoding to synch up with WG2 and to let the discussion about that one character continue and resurface if it needs to.
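
Note on section C: the code point bookkeeping described there can be checked mechanically. The following is only a minimal sketch, assuming the ranges quoted in section C; the helper names (code_points, fmt) and the dictionary layout are hypothetical and are not part of any UTC or WG2 tooling. It shows the overlap between the February and May allocations, and that both proposed resolutions occupy the same 8 code points, differing only in which characters land where.

```python
# Illustrative sketch only: the ranges come from section C of this docket;
# the helper names and data layout are assumptions made for this example.

def code_points(first, last):
    """Return the set of code points in the inclusive range first..last."""
    return set(range(first, last + 1))

def fmt(cps):
    """Format a collection of code points as 'U+XXXX U+XXXX ...'."""
    return " ".join(f"U+{cp:04X}" for cp in sorted(cps))

# Allocations of record from the two UTC meetings.
uighur_feb = code_points(0x2C65, 0x2C6A)    # 6 Uighur characters (February)
sencoten_may = code_points(0x2C65, 0x2C66)  # 2 Sencoten characters (May)

# The conflict: code points claimed by both allocations.
conflict = sorted(uighur_feb & sencoten_may)

# Resolution recorded in the pipeline: move the May characters to open slots.
pipeline = {
    "uighur": uighur_feb,
    "sencoten": code_points(0x2C6B, 0x2C6C),
}

# Resolution in the preliminary FDAM 2 documents, accepted by WG2:
# keep the May characters in place and shift the Uighur characters by two.
fdam2 = {
    "uighur": {cp + 2 for cp in uighur_feb},
    "sencoten": sencoten_may,
}

print("conflict:       ", fmt(conflict))
print("pipeline Uighur:", fmt(pipeline["uighur"]))
print("FDAM 2 Uighur:  ", fmt(fdam2["uighur"]))
```

Either resolution removes the overlap; the FDAM 2 arrangement is the one the UTC is being asked to affirm.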