L2/08-205

Title: WG2 Consent Docket
Source: Ken Whistler
Date: April 30, 2008

Following my usual procedure, I have rolled up all items from the latest WG2 meeting (WG2 #52, Redmond, WA, April 21 - 25, 2008) for which there is a synchronization issue that the UTC needs to address.

This WG2 meeting progressed 2 amendments:

Amendment 5: The disposition of comments was completed for FPDAM 5, and FDAM 5 will be issued soon.

Amendment 6: The disposition of comments was completed for PDAM 6, and a second PDAM 6-2 will be issued soon.

For the overall summary of the repertoire for those two amendments, pending production of the actual amendments, you can refer to:

    WG2 N3465 (= L2/08-178) Result of Repertoire Review for FDAM5...
    WG2 N3466 (= L2/08-179) Result of Repertoire Review for PDAM6.2...

In the consent docket this time, I have organized the issues by which amendment they are associated with, to help keep things straight.

Under current plans, the new repertoire for Amendments 5 and 6 will be targeted for a future version of Unicode, most likely to be designated Version 5.2.

Note that for changes specific to FDAM 5, the UTC really just has to approve things at this point, as there is no chance now to reconsider the decisions by WG2; the FDAM vote is an up or down vote, with no technical changes allowed.

================================================================

A. New Tai Lue Glyph Erratum (FDAM 5)

WG2 N3380 (= L2/08-036) noted a problem for U+19D1 NEW TAI LUE DIGIT ONE. The glyph needed correction to the "hora" form of the digit, instead of the "tham" form, to make way for the encoding of the tham digit one, which is actually distinguished in New Tai Lue. (See item Q below.)

Ordinarily glyph errata are just handled editorially, but since this one involves the justification for encoding a new character, it is probably best to be explicit about this.

Recommendation: Approve the glyph change for U+19D1 NEW TAI LUE DIGIT ONE.
==================================================================

B. Additional CJK Ideographs (FDAM 5)

The FPDAM 5 contained 6 additional CJK ideographs, for mapping to the Japanese ARIB STD B64.

Two of those were CJK unified ideographs:

    9FC4..9FC5 (= ARIB #47, #95)

The UTC already approved those two in October.

The other four were CJK compatibility ideographs:

    FA6B..FA6E (= ARIB #39, #67, #93, #105)

The UTC withheld approval of those 4, and the joint UTC/L2 recommendation, submitted in ballot comments, was to map the last four using IVS, rather than encoding compatibility characters for them. Faced with opposition from Japan, WG2 declined that route.

Based on input from Japan in WG2 N3318, the outcome for FDAM 5 was slightly modified in the Disposition of Comments. The existing 9FC4..9FC5 were unaffected, but one of the compatibility characters (ARIB #93) was moved to be a CJK unified ideograph, instead.

The net net of this is that FDAM 5 contains the following 4 additional CJK characters not yet approved by the UTC:

    9FC6 (= ARIB #93)
    FA6B..FA6D (= ARIB #39, #67, #105)

Recommendation: Approve the 4 additional CJK ideographs (one in the CJK Unified Ideographs block and three in the CJK Compatibility Ideographs block).

================================================================

C. CJK Extension C (FDAM 5)

O.k., this is a *big* one.

The FPDAM revised repertoire for Extension C stayed unmodified:

    2A700..2B734 (4149 additions) CJK Unified Ideographs Extension C

In discussion of the prior WG2 consent docket last October, the UTC withheld approval for Extension C, based on the premise that the PDAM 5 Disposition of Comments had resulted in a large number of changes, and further changes might yet occur in review. At this point, the repertoire has moved past its FPDAM Disposition of Comments, and no further technical changes are allowed.

Recommendation: Approve the CJK Unified Ideographs Extension C repertoire and block.
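
The character counts quoted in items B and C follow directly from the inclusive code point ranges. A minimal sketch of that arithmetic, for anyone who wants to double-check it (the helper name `range_size` is hypothetical and not part of any Unicode tooling):

```python
def range_size(first: int, last: int) -> int:
    """Number of code points in the inclusive range first..last."""
    if last < first:
        raise ValueError("range end precedes range start")
    return last - first + 1


# Item C: CJK Unified Ideographs Extension C, 2A700..2B734
ext_c = range_size(0x2A700, 0x2B734)
print(ext_c)  # → 4149

# Item B: 9FC6 plus FA6B..FA6D, the CJK characters still awaiting UTC approval
cjk_additions = range_size(0x9FC6, 0x9FC6) + range_size(0xFA6B, 0xFA6D)
print(cjk_additions)  # → 4
```

Both results agree with the figures given in the docket text.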
=================================================================

D. Sinhala Numbers (FDAM 5)

WG2 accommodated the U.S. request (based on feedback from Sri Lanka at the last UTC meeting, see L2/08-007) to remove the Sinhala digits (0DE7..0DEF) and other Sinhala numbers (0DF5..0DFF) from Amd 5, pending further input from Sri Lanka.

Recommendation: No action needed. The former UTC approval can stand. These are simply on hold for ballot until further feedback is received. Any formal decision to modify the former approval can wait until then.

=================================================================

E. Avestan Separation Point (FDAM 5)

WG2 accommodated the U.S. request to remove U+10B38 AVESTAN SEPARATION POINT from Amendment 5.

Recommendation: No action needed. This is just to note the outcome of the other outstanding place where the UTC character approvals had not been synched up with the WG2 approvals.

=================================================================

F. Miscellaneous Name Corrections (FDAM 5)

The resolutions for progressing FDAM 5 note the following name corrections:

    11FD HANGUL JONGSEONG KIYEOK-KHIEUKH
    A96E HANGUL CHOSEONG RIEUL-KHIEUKH
    A973 HANGUL CHOSEONG PIEUP-KHIEUKH
    AAB7 TAI VIET MAI KHIT

All of these were just corrections of single-letter typos in the names list that was balloted for FPDAM 5. The UTC didn't record specific names for these when the Old Hangul jamos and Tai Viet were approved, since the approvals simply referenced other documents. Nevertheless, the UTC should go on record as approving these changes, which are technical corrections for what was balloted.

Recommendation: Approve the four character name corrections as noted.

=================================================================

That's it for FDAM 5. The synchronization issues for PDAM 6-2 are rather more wide-ranging, since a lot of new scripts and characters were added, some of which have not yet been formally reviewed by the UTC.
The following items G through R are all associated with PDAM 6-2.

=================================================================

G. Meetei Mayek

Meetei Mayek (formerly spelled "Meitei Mayek") had been on hold, pending input. WG2 added it to PDAM 6-2, based on WG2 N3470 (= L2/08-180). N3470 reflected input received from India and from Manipur in particular, indicating preferences for spelling the script (and its letters) as "Meetei", and for reordering the characters to use the 27-letter alphabet currently mandated in education in Manipur state.

The UTC's approval status for Meetei Mayek reflects the "Meitei" spelling and the older Sanskrit alphabetic order from earlier proposal documents.

Note that the WG2 approval based on N3470 also includes the resolution of the danda issue. The UTC had approved Meetei without the two danda punctuation characters, pending progress on the general principles about encoding dandas. Now that WG2 has agreed to incorporate the proposed principles about encoding dandas, the decision was taken to move ahead on approval of dandas for Meetei, as well. So the script as currently approved for PDAM 6-2 balloting includes the two danda characters:

    1CAC MEETEI MAYEK DANDA
    1CAD MEETEI MAYEK DOUBLE DANDA

So it is now 78 characters (not 76), with the script and block name Meetei Mayek (not Meitei Mayek). The block range is the same: 1C80..1CCF. The characters are all reordered.

Recommendation: Reconfirm approval of encoding of Meetei Mayek, with changed names and code points, including the two danda punctuation marks, as per WG2 N3470.

=================================================================

H. Old Turkic

Old Turkic (Orkhon), a right-to-left runic script, was added to PDAM 6-2 on the basis of WG2 N3357 (= L2/08-071).

The UTC has reviewed earlier iterations of the various proposals for Old Turkic, but has not yet reviewed the final proposal nor approved it for encoding.
N3357 responded to some of the criticisms of the earlier proposals, and in my opinion is about the best consensus we are going to get on this script. There is a fair amount of duplication of encoding of the "same letter" based on regional variations in preferred letter shapes -- but this is pretty much par for the course for runes, and is no different in principle than what we ended up doing for the Western and Northern European runes in the Runic block.

The repertoire in N3357 and for PDAM 6-2 consists of 71 characters in the range 10C00..10C46, in an Old Turkic block 10C00..10C4F. Note that this is a right-to-left runic script.

Recommendation: Approve the Old Turkic script, as per WG2 N3357.

=================================================================

I. Nushu

WG2 approved Nushu for PDAM 6-2 based on WG2 N3462.

WG2 N3426 (= L2/08-171) was the main current proposal for Nushu, containing 382 characters at 1B000..1B17D. N3462 is an update that added 7 more characters at 1B17E..1B184, for a total of 389. China then promised to submit another revision that reordered the final 7 into the main set of characters. That is supposed to be in N3463, but the document link is still pending.

The Nushu proposal documents were much improved from prior versions, and include much more mapping information than before. But in my opinion the move to ballot was a little premature still, given outstanding questions about mapping of some sources and the principles whereby the 389 were chosen.

Also, given the rush and the unavailability of a printed update or revised font during WG2 -- which means that the Nushu characters are not part of the Summary of Repertoire documents -- it is still a little unclear whether the PDAM 6-2 ballot will contain the ordering from N3462 or the promised ordering from N3463.

Recommendation: Discuss and review the documents, but withhold approval for encoding the Nushu script until the next UTC meeting, when we can see what is actually under ballot in PDAM 6-2.

=================================================================

J. Rumi Numeral Symbols

WG2 approved the Rumi Numeral Symbols for PDAM 6-2. The UTC has already approved the repertoire and code points, and those match. The only difference is that the UTC approved "Rumi Symbols" for the block name, and WG2 chose "Rumi Numeral Symbols". The block range is 10E60..10E7F.

Recommendation: Approve "Rumi Numeral Symbols" as the block name, to match the WG2 resolution.

=================================================================

K. Myanmar Additions for Khamti Shan, Aiton, and Phake

WG2 approved 18 more Myanmar characters for minority languages, with names and code points based on WG2 N3436 and glyphs based on WG2 N3423 (= L2/08-181). These go into PDAM 6-2.

In the existing Myanmar block, these consist of:

    109A MYANMAR SIGN KHAMTI TONE-1
    109B MYANMAR SIGN KHAMTI TONE-3
    109C MYANMAR VOWEL SIGN AITON A
    109D MYANMAR VOWEL SIGN AITON AI

The other 14 were added in a new Myanmar Extended-A block (AA60..AA7F), in the range AA60..AA6D.

Recommendation: Review and approve these additions for Myanmar.

=================================================================

L. Japanese TV Symbols

This refers to the set of ARIB characters that the UTC has been working on for some time.

WG2 accepted a set of 186 symbols for ballot in PDAM 6-2, based on WG2 N3469. Michel Suignard had updated the proposal document extensively to account for various feedback he had received since the last UTC review, and then the entire set received further work during WG2, updating various code points, adjusting names and annotations, and modifying slightly the repertoire to be encoded versus the unifications with existing encoded characters.
The result of all this was rolled into a revised proposal document (N3469) which was then used for the approval resolution.

The additions are scattered in various existing symbols blocks (Number Forms, Miscellaneous Symbols, Enclosed CJK Letters and Months), and then two new blocks on the SMP are proposed: Enclosed Alphanumeric Supplement (U+1F100..U+1F1FF) and Enclosed Ideographic Supplement (U+1F200..U+1F2FF).

I won't try to replicate the entire repertoire here -- they are all listed in N3469.

Recommendation: Review and approve these additions for Japanese TV symbols.

=================================================================

M. Kaithi

WG2 approved the Kaithi script for PDAM 6-2, with code points as already approved by the UTC, and in the same block range, 11080..110CF, based on WG2 N3389 (= L2/07-418), except that the 10 digits were omitted. This change was based on the author's documented request in WG2 N3438 (= L2/08-156) that the digits simply be unified with Devanagari digits, instead.

Recommendation: Rescind the UTC approval of the 10 characters for Kaithi digits, 110C0..110C9.

=================================================================

N. Old South Arabian

WG2 approved the Old South Arabian repertoire for PDAM 6-2, as approved by the UTC, but moved the block. The UTC assigned 10A80..10A9F, but after ad hoc consultation among the roadmappers, WG2 decided on 10A60..10A7F, instead. Otherwise there were no changes to character names or glyphs or relative ordering.

Recommendation: Update the approved Old South Arabian block range to 10A60..10A7F, with all code points revised accordingly.

=================================================================

O. Vedic Additions

WG2 approved the Vedic additions based on Debbie Anderson's summary of the consensus position in WG2 N3456 (= L2/08-176), and with glyphs as shown in the main proposal document WG2 N3383 (= L2/08-050).

In terms of character approvals at this point, we are nearly in synch.
WG2 did not put U+094E DEVANAGARI VOWEL SIGN PRISHTHAMATRA E on the ballot, but that was the proper outcome, because we are waiting for more Government of India input on that particular character before proceeding further.

The one thing that needs resynchronization is the naming of several of the Vedic tone characters. Debbie turned up several inconsistencies in naming among the various source documents, between characters named "VEDIC SIGN..." and characters named "VEDIC TONE...". This issue was ad hocced, and the consensus position in WG2 N3456 listed "VEDIC TONE..." for all of the inconsistent characters. Accordingly, the UTC needs to update its approvals to note name corrections for the following characters:

    1CD2 VEDIC TONE PRENKHA
    1CE2 VEDIC TONE VISARGA SVARITA
    1CE3 VEDIC TONE VISARGA UDATTA
    1CE4 VEDIC TONE REVERSED VISARGA UDATTA
    1CE5 VEDIC TONE VISARGA ANUDATTA
    1CE6 VEDIC TONE REVERSED VISARGA ANUDATTA
    1CE7 VEDIC TONE VISARGA UDATTA WITH TAIL
    1CE8 VEDIC TONE VISARGA ANUDATTA WITH TAIL

Recommendation: Approve the listed character name changes.

=================================================================

P. UCAS Additions

WG2 added a number of Unified Canadian Aboriginal Syllabics characters, based on WG2 N3427 (= L2/08-132).

Some of these were added in the existing UCAS block, filling it up:

    U+1400 CANADIAN SYLLABICS HYPHEN
    U+1677..U+167F various syllables for Woods-Cree and Blackfoot

The rest were added in a new block, Unified Canadian Aboriginal Syllabics Extended-A (range A9E0..A9FF), at the code points U+A9E0..U+A9FC. These are additional syllables and finals for Ojibway, Beaver, Carrier, and others.

Total of 39 characters added.
Of these there are a few characters with potential lookalike or property problems that should be discussed by the UTC (the hyphen, which looks like an equals sign, and a couple more "dots"), but the simplest way to stay in synch right now is to approve the set provisionally, and then decide if anything needs to be addressed in ballot comments later.

Recommendation: Approve the repertoire, code points, and glyphs for the UCAS additions.

=================================================================

Q. New Tai Lue digit

The ordinary "hora" form of the New Tai Lue digit for one can be confused with the letter for the vowel sign AA. Because of this, New Tai Lue also borrowed in a "tham" form of the digit for one, which is used in contexts where disambiguation is required. This tham form is not used for regular number formation.

Because this is conceived of as a character contrast, rather than simply alternate glyphs (both forms can occur contrastively in plain text), WG2 decided to encode a new character for PDAM 6-2, based on WG2 N3380 (= L2/08-036):

    19DA NEW TAI LUE THAM DIGIT ONE

This was an addition coordinated with the glyph fix for the existing 19D1 NEW TAI LUE DIGIT ONE (see item A above).

Recommendation: Approve this character addition for New Tai Lue.

=================================================================

R. Currency Signs

In addition to the Livre Tournois sign that the UTC has already approved, WG2 added two more new currency signs for PDAM 6-2:

    20B7 SPESMILO SIGN (based on WG2 N3390 = L2/08-115)
    20B8 TENGE SIGN (based on WG2 N3392 = L2/08-116)

Recommendation: Approve both new currency signs.
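
For anyone updating approval records from this docket, the counts quoted in items H, I, K, M, and P can be cross-checked against the code point ranges listed in those items, and the Old South Arabian move in item N can be expressed as a fixed offset. The following is only a sketch under stated assumptions: the names `range_size`, `QUOTED`, `mismatches`, and `move_old_south_arabian` are hypothetical, and the fixed-offset mapping assumes (per item N) that relative ordering within the block is unchanged.

```python
def range_size(first: int, last: int) -> int:
    """Number of code points in the inclusive range first..last."""
    return last - first + 1


# (item, count quoted in the docket, inclusive ranges as listed in the docket)
QUOTED = [
    ("H. Old Turkic", 71, [(0x10C00, 0x10C46)]),
    ("I. Nushu", 389, [(0x1B000, 0x1B17D), (0x1B17E, 0x1B184)]),
    ("K. Myanmar additions", 18, [(0x109A, 0x109D), (0xAA60, 0xAA6D)]),
    ("M. Kaithi digits", 10, [(0x110C0, 0x110C9)]),
    ("P. UCAS additions", 39,
     [(0x1400, 0x1400), (0x1677, 0x167F), (0xA9E0, 0xA9FC)]),
]


def mismatches(entries):
    """Return (item, quoted, computed) for entries whose ranges do not sum to the quoted count."""
    bad = []
    for item, quoted, ranges in entries:
        total = sum(range_size(first, last) for first, last in ranges)
        if total != quoted:
            bad.append((item, quoted, total))
    return bad


# Item N: the block moved from 10A80..10A9F to 10A60..10A7F.
# Assumption: ordering is unchanged, so every code point shifts by the same offset.
OSA_OLD_START, OSA_OLD_END = 0x10A80, 0x10A9F
OSA_NEW_START = 0x10A60


def move_old_south_arabian(cp: int) -> int:
    """Map a code point in the UTC-assigned block to its WG2-assigned position."""
    if not OSA_OLD_START <= cp <= OSA_OLD_END:
        raise ValueError(f"U+{cp:04X} is outside 10A80..10A9F")
    return cp - OSA_OLD_START + OSA_NEW_START


print(mismatches(QUOTED))                      # → []
print(f"{move_old_south_arabian(0x10A9F):X}")  # → 10A7F
```

An empty list from `mismatches` means every quoted count agrees with its ranges as stated in this docket; it says nothing about the repertoire in the ballot documents themselves.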