L2/13-154

Title: WG2 Consent Docket
Author: Ken Whistler
Date: July 22, 2013
Action: For consideration by UTC

WG2 #61 met in Vilnius, Lithuania, the week of June 10 - 14, 2013. During that meeting a number of resolutions were taken which progressed Amendment 2 to 10646 3rd Edition and which also progressed the CD for 10646 4th Edition. See L2/13-144 (= WG2 N4404) for the full details of all the resolutions.

As usual, in this consent docket, I summarize just the parts of the actions taken by WG2 which result in a different status between WG2 and the UTC regarding various character approvals. These are the differences where the UTC needs to make some decision regarding how to synchronize approvals (or to oppose a proposed change).

For convenience, the changes are grouped here by amendment (or CD).

Note that the pipeline page:

http://www.unicode.org/alloc/Pipeline.html

has already been updated to reflect changes in approvals by WG2, and to highlight differences from the current approvals by the UTC, so that page can be useful in following the discussion below on the individual issues.

=======================================================================

Changes Related to Amendment 2

Amendment 2 is now being progressed to FDAM status. The FDAM ballot is a non-technical ballot, so at this point it is too late to make further technical changes to its content. I recommend that the UTC simply approve the few points where Amendment 2 is out of synch with earlier UTC approvals.

Note that with the additions from Amendment 2, the anticipated repertoire of Unicode 7.0 is now complete. The full listing of the revised Amendment 2 content for FDAM balloting, including a number of glyph changes, can be seen in WG2 N4458 (= L2/13-150).

A. MANAT SIGN

WG2 approved the addition of U+20BC MANAT SIGN. The UTC has seen this character before, but held off on approval waiting for further evidence of use.
WG2 saw that additional evidence, and decided to accelerate the encoding into Amendment 2, to avoid having to do a hurry-up publication of a version including just the currency sign addition. The relevant document is WG2 N4445.

Recommendation: The UTC should approve U+20BC MANAT SIGN for Unicode 7.0.

B. Move and rename of two phonetic characters

WG2 decided to move the code points for two characters in ballot, and renamed one to avoid a name inconsistency pointed out by the German NB. The net change was from:

  U+A7AE LATIN SMALL LETTER INVERTED ALPHA
  U+A7AF LATIN LETTER SMALL CAPITAL OMEGA

to:

  U+AB64 LATIN SMALL LETTER INVERTED ALPHA
  U+AB65 GREEK LETTER SMALL CAPITAL OMEGA

The name change from "LATIN" to "GREEK" for the small capital omega was to avoid a shape collision problem for the Latin capital omega. It also followed precedent for some other Greek letters used for phonetic transcription.

Recommendation: The UTC should approve the two code point changes and one name change for Unicode 7.0.

C. Name changes for Siddham punctuation

WG2 changed the names of two Siddham punctuation marks, from:

  U+115C4 SIDDHAM SEPARATOR-1
  U+115C5 SIDDHAM SEPARATOR-2

to:

  U+115C4 SIDDHAM SEPARATOR DOT
  U+115C5 SIDDHAM SEPARATOR BAR

Recommendation: The UTC should approve the two name changes for Unicode 7.0.

D. Pahawh Hmong clan logographs

WG2 approved 19 Pahawh Hmong clan logographs which the UTC has seen, but not yet accepted. The characters are in the range U+16B7D..U+16B8F. The complete list of character code points, names, and glyphs can be seen in L2/13-150 in the Pahawh Hmong block.

Recommendation: The UTC should approve the addition of these 19 clan logographs for Unicode 7.0.

E. Mende --> Kikakui --> Mende Kikakui

The block which originally had been approved by WG2 as "Mende" was changed to "Mende Kikakui". The UTC had previously approved a name change from "Mende" to "Kikakui" and requested that that name be used in Amendment 2.
The name change to "Mende Kikakui" was a compromise, responding to a comment from Ireland and noting the similar precedent for the name of the Bassa Vah script.

Recommendation: The UTC should approve the block name change for Kikakui to Mende Kikakui (U+1E800..U+1E8DF), and the corresponding change for the names of all characters in that block, for Unicode 7.0.

F. Syntax change for Ideographic Description Sequences

There was an extensive discussion in WG2 regarding Japanese NB comments about ideographic description sequences. The main objection to the use of PUA characters with ideographic description sequences -- a change to the IDS syntax that the UTC had already agreed to earlier -- turned out to be relatively easy to accommodate, once it became clear that specifying a substitution character for an otherwise unencoded component would satisfy Japan's objection. As a result, after discussion, everyone concluded that U+FF1F FULLWIDTH QUESTION MARK was the perfect candidate to add to the syntax to indicate the presence of an unencoded component.

Annex I.1 in 10646 will be adjusted to add a bullet describing the use of U+FF1F in an IDS to indicate an 'undescribed component'.

The corresponding change for the Unicode Standard, to keep the syntax of IDS in synch, would be to add "| U+FF1F" to the IDS syntax in Section 12.2 of the core specification, plus a little text explaining the use of U+FF1F to indicate the presence of an undescribed component.

Recommendation: The UTC should approve the described change to the IDS syntax for Unicode 7.0.

===========================================================================

Changes Related to the CD for the 4th Edition

The CD for the 4th Edition is now being progressed to DIS ballot. The full listing of the additional 4th Edition repertoire for DIS balloting can be seen in WG2 N4459 (= L2/13-151).

G. Middle Dot

WG2 reached a compromise on the long-fought-over additional middle dot letter.
The compromise consisted of yet another name change and an agreement to make the dot large enough not to be easily confused with the existing U+00B7 MIDDLE DOT. The current character under ballot in the DIS is:

  U+A78F LATIN LETTER SINOLOGICAL DOT

Recommendation: The UTC should discuss the issue and decide what to do.

H. Name change for Sakha Yat

WG2 approved a name change for

  U+AB60 LATIN SMALL LETTER SAKHA IOTIFIED A

to:

  U+AB60 LATIN SMALL LETTER SAKHA YAT

Recommendation: The UTC should approve this name change.

I. Hungarian --> Old Hungarian

WG2 discussed Hungarian yet another time. This one was rather complicated to work out, because Hungarian was balloted in Amendment 2, and technically Amendment 2 passed, with Hungarian NB approval. However, there was information that made it clear that at least some of those in Hungary who approved the encoding as shown in Amendment 2 were not that happy about the compromise script name "Hungarian" and actually preferred "Old Hungarian". But there was no ballot comment from Hungary to that effect.

Rather than jeopardize the FDAM vote on Amendment 2 with yet another change of the script and character names, made without opportunity for a technical vote, WG2 decided to push Hungarian one more time -- this time into the 4th Edition CD, but with the revised script and character names "Old Hungarian". This gives one more opportunity for everybody involved to review and assent explicitly to the name.

One of the upshots for the UTC is that this moved (Old) Hungarian out of the repertoire being prepared for Unicode 7.0.

Recommendation: The UTC should approve this name change to the block, script, and the corresponding changes to all the character names. Note that the "Old Hungarian" script name was the original one that the UTC approved, quite some time ago, and seems to be the one most acceptable to all parties except for the group arguing for some version of "ROVASH" for the name.

J. Sharada: Moving some code points

WG2 reviewed comments on the code points for additions of Sharada punctuation, and agreed to a few code point moves. The current UTC approval is:

  U+111CE SHARADA CONTINUATION SIGN
  U+111DB SHARADA HEADSTROKE
  U+111DC SHARADA SIGN SIDDHAM
  U+111DD SHARADA SECTION MARK-1
  U+111DE SHARADA SECTION MARK-2

The revised code points in the 4th Edition DIS are:

  U+111DB SHARADA SIGN SIDDHAM
  U+111DC SHARADA HEADSTROKE
  U+111DD SHARADA CONTINUATION SIGN
  U+111DE SHARADA SECTION MARK-1
  U+111DF SHARADA SECTION MARK-2

Recommendation: The UTC should approve these code point changes.

K. Siddham section marks

The UTC is on record as having approved 7 Siddham section marks, with names:

  U+115CB SIDDHAM SECTION MARK-2
  ...
  U+115D4 SIDDHAM SECTION MARK-11

These were the clearly attested examples out of a longer list in the proposal. And the naming of the marks just followed the original generic naming in the proposal.

WG2 reviewed further feedback on the Siddham section marks, and agreed to approve the full set of them, with revised, descriptive names, instead of just numbers. The details of the revised proposal can be found in WG2 N4457. The revised range and a sample of the names are:

  U+115CA SIDDHAM SECTION MARK WITH TRIDENT AND U-SHAPED ORNAMENTS
  ...
  U+115D7 SIDDHAM SECTION MARK WITH CIRCLES AND FOUR ENCLOSURES

for a total of 14 section marks. The full list as approved can be seen in the DIS repertoire document, WG2 N4459 (= L2/13-151).

Recommendation: The UTC should approve the revised repertoire, code points, and names, to get back into synch with the DIS repertoire. (Any further suggestions about names could then be taken up separately, if desired, for ballot comments.)

L. Siddham letter variants

In response to a Japanese request for a number of variant letters and combining marks for Siddham, WG2 decided to add a total of 6 variant letters:

  U+115E0 SIDDHAM LETTER I VARIANT FORM A
  U+115E1 SIDDHAM LETTER I VARIANT FORM B
  U+115E2 SIDDHAM LETTER II VARIANT FORM A
  U+115E3 SIDDHAM LETTER U VARIANT FORM A
  U+115E4 SIDDHAM VOWEL SIGN U VARIANT FORM A
  U+115E5 SIDDHAM VOWEL SIGN UU VARIANT FORM A

The UTC has seen the original request, in WG2 N4407R (= L2/13-110), but declined to take action at the last meeting, pending further feedback and the outcome of the ballot discussion in WG2. Since then there has been some contrary feedback, as well. See L2/13-126 and commentary on the unicore list.

Recommendation: The UTC should discuss and decide what to do.

M. Early Dynastic Cuneiform code point changes

WG2 decided to remove one duplicate Cuneiform code point. This removal was already approved by the UTC. As anticipated, WG2 then decided to remove the "hole" in the block and closed up the gap by moving code points. The revised range is: U+12480..U+12543.

Recommendation: The UTC should approve the move of the subrange U+124D3..U+12544 to U+124D2..U+12543.

N. Hatran

WG2 decided to delete HATRAN LETTER RESH and three number signs from the Hatran block. These changes were responsive to issues raised by the UTC on the CD. HATRAN LETTER DALETH was renamed HATRAN LETTER DALETH-RESH, and the gap in the range of letters and numbers was removed by changing several code points. The revised encoding for the block can be seen in L2/13-151.

Recommendation: The UTC should approve the Hatran block (U+108E0..U+108FF), with 26 characters, code points, names, and glyphs as shown in L2/13-151.

O. Anatolian Hieroglyphs

WG2 also made a number of changes to the content of the Anatolian Hieroglyphs block under ballot in the CD, including removal of all reference to decompositions for the signs.
These changes were also responsive to comments which had been made by the UTC on earlier documents.

Recommendation: The UTC should approve the Anatolian Hieroglyphs block (U+14400..U+1467F), with 583 characters in the range U+14400..U+14646, with code points, names, and glyphs as shown in L2/13-151.

P. CJK Extension E

WG2 made a number of revisions to the repertoire under ballot in the CD for CJK Extension E, including the removal of 6 characters, moving code points to get rid of the gaps in the range, and some fixes for associated mapping data. At this point, the repertoire seems mature enough for UTC approval.

Recommendation: The UTC should approve the CJK Extension E block (U+2B820..U+2CEAF), with 5762 characters in the range U+2B820..U+2CEA1, with code points and glyphs as shown in L2/13-151 (see p. 60ff of the pdf).
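
The character counts quoted in the recommendations can be cross-checked against the stated code point ranges with simple inclusive-range arithmetic. The following is a minimal Python sketch of that check, not part of the docket itself. It assumes that each quoted range is contiguous (no unassigned gaps inside it); the helper name `count_code_points` and the `STATED` table are illustrative only and do not correspond to any Unicode tooling.

```python
def count_code_points(first: int, last: int) -> int:
    """Return the number of code points in the inclusive range first..last."""
    return last - first + 1


# (label, first code point, last code point, count stated in the docket)
# Assumption: each range is fully populated, with no gaps.
STATED = [
    ("Pahawh Hmong clan logographs", 0x16B7D, 0x16B8F, 19),
    ("Siddham section marks", 0x115CA, 0x115D7, 14),
    ("Anatolian Hieroglyphs", 0x14400, 0x14646, 583),
    ("CJK Extension E", 0x2B820, 0x2CEA1, 5762),
]


def find_mismatches(entries=STATED):
    """Return (label, computed, stated) for every entry whose count disagrees."""
    return [
        (label, count_code_points(first, last), stated)
        for label, first, last, stated in entries
        if count_code_points(first, last) != stated
    ]


if __name__ == "__main__":
    for label, first, last, stated in STATED:
        computed = count_code_points(first, last)
        status = "ok" if computed == stated else "MISMATCH"
        print(f"{label}: U+{first:04X}..U+{last:04X} -> {computed} ({status})")
```

The same helper can be applied to the Cuneiform move in item M: the source subrange U+124D3..U+12544 and the target subrange U+124D2..U+12543 should contain the same number of code points, since the move only shifts everything down by one position.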