L2/07-150

Title: WG2 Consent Docket
Source: Ken Whistler
Date: May 10, 2007

Following my usual procedure, I have rolled up all items from the latest WG2 meeting (WG2 #50, Frankfurt, Germany, April 23 - 27, 2007) for which there is a synchronization issue that the UTC needs to address.

This WG2 meeting progressed 3 amendments:

Amendment 3: The disposition of comments was completed for the FPDAM 3, and FDAM 3 will be issued soon.

Amendment 4: The disposition of comments was completed for the PDAM 4, and FPDAM 4 will be issued imminently.

Amendment 5: A new amendment was started, and PDAM 5 has already been issued.

In the consent docket this time, I will be organizing the issues in part by which amendment they are associated with, to help keep things straight.

Note that as of current plans, the repertoires for Amendments 3 and 4 together will be the eventual repertoire added for Unicode 5.1. The new repertoire for Amendment 5 will likely be targeted for a future version of Unicode past Version 5.1.

Note also that for changes specific to FDAM 3, the UTC really just has to approve things at this point, as there is no chance now to reconsider the decisions by WG2; the FDAM vote is an up or down vote, with no technical changes allowed.

================================================================
A. Latin: Miscellaneous Name Change (FDAM 3)

  2C78 LATIN SMALL LETTER E WITH TAIL

WG2 accepted a name change to:

  2C78 LATIN SMALL LETTER E WITH NOTCH

This was the character with the controversy over the term "FINIAL" in the name in the original proposal. UTC accepted "WITH TAIL" as an alternate. The issue was ad hocced in WG2, in response to Irish NB comments, and it was discovered that the feature in common between this letter and others which will eventually be proposed for the Landsmålsalfabet is an inward notch diacritic on rounded portions of the letters. So "WITH NOTCH" sets a better naming precedent for those eventual characters.

Suggestion: Approve the revised name.
================================================================
B. Vai: New Characters Added, Plus Block Change (FDAM 3)

WG2 accepted two more characters in the Vai block, based on WG2 N3243:

  A62A VAI SYLLABLE NDOLE MA
  A62B VAI SYLLABLE NDOLE DO

In part because of the possibility of some additional historic characters, and in part because of other rearrangement of the A6XX row, WG2 also extended the Vai block from A500..A62F to A500..A63F.

Suggestion: Approve the two new characters and the revised block range for Vai.

================================================================
C. Mirrored Math Arrows (FDAM 3)

Based on last-minute review of bidirectional mirroring issues for asymmetric arrow operators with tildes, Asmus Freytag, Barbara Beeton, and Murray Sargent asked (in WG2 N3259) for the addition of 6 more arrow operators to complement the already-approved:

  2B41 REVERSE TILDE OPERATOR ABOVE LEFTWARDS ARROW
  2B42 LEFTWARDS ARROW ABOVE REVERSE ALMOST EQUAL TO

The 6 new characters were approved by WG2 for FDAM 3. Their names and code points are:

  2B47 REVERSE TILDE OPERATOR ABOVE RIGHTWARDS ARROW
  2B48 RIGHTWARDS ARROW ABOVE REVERSE ALMOST EQUAL TO
  2B49 TILDE OPERATOR ABOVE LEFTWARDS ARROW
  2B4A LEFTWARDS ARROW ABOVE ALMOST EQUAL TO
  2B4B LEFTWARDS ARROW ABOVE REVERSE TILDE OPERATOR
  2B4C RIGHTWARDS ARROW ABOVE REVERSE TILDE OPERATOR

The rationale is provided in the document. Ordinarily one would expect more review time for such additions, but the argument was that with two of these already going into Amd 3, it was more important to stave off confusion about character and glyph identity by encoding a complete set now, rather than waiting to add them piecemeal later, after problematical choices might have been made in rolling out new math fonts.

Suggestion: Approve the 6 new math arrow characters.

================================================================
D. Lanna (FPDAM 4)

The UTC has not yet formally approved the encoding for Lanna script.
The UTC has reviewed the proposals a number of times, and requested the removal of two characters from PDAM 4, as a result of that review, but was effectively waiting for the results of this WG2 meeting and the ballot disposition of comments to settle on a consensus encoding to give final approval to.

Notable changes from PDAM 4 that were the result of the disposition of comments:

1. The two vowel signs AM and TALL AM were removed, as UTC requested.

2. Two new characters were added:

     1A29 LANNA LETTER KHUEN HIGH CHA
     1AAD LANNA SIGN CAANG

3. Various ranges of characters were moved by a few code points to accommodate the two removals and two additions.

4. Several character names were updated, most notably involving the respelling of "KHUN" as "KHUEN". The other changes involved removing a "LOW" or "HIGH" in a name where no character of contrasting register occurs.

5. At the request of the Chinese NB, the parenthetical alternative name "(Old Tai Lue)" is being added to the block name "Lanna" in Annex A.2.2 in 10646.

6. The font style for the representative glyphs was changed from a Thai-style font to a Khün-style (Myanmar-like) font, again to accommodate the Chinese NB.

At this point, all of the issues raised by the U.S., Irish, U.K., and Chinese NBs seem to have been resolved satisfactorily, and I think the script should now be considered stable enough for formal approval.

Suggestion: Approve the encoding of Lanna, as documented in WG2 N 3264 (Charts for FPDAM 4, = L2/07-131), with block name "LANNA" and block range 1A20..1AAF.

================================================================
E. CJK: Disunification of U+4039 (FPDAM 4)

The proposal to disunify the unified CJK ideograph U+4039 was discussed at great length and in excruciating detail at the WG2 meeting. All of the source reference mapping issues seemed to have been resolved, based on the assumption that a disunification was warranted.
The proposal document, WG2 N3196R2 (= L2/07-010) was revised at the meeting to provide more details required for the source mappings and other properties for both the original character and the new character to be disunified from it. See that document for details and justification.

WG2 decided on U+9FC3 as the code point for the new character.

Suggestion: Approve the disunification, the new character at U+9FC3, and the source mapping and revised property data for both U+4039 and U+9FC3.

================================================================
F. Latin: Capital Letter Sharp S (FPDAM 4)

WG2 approved the addition of a capital letter sharp S, based on document WG2 N3227R (= L2/07-108). See the UTC agenda item on this topic and the related feedback documents (L2/07-149, L2/07-156, L2/07-157, ...)

The code point and name approved by WG2 for ballot are:

  U+1E9E LATIN CAPITAL LETTER SHARP S

Note that a UTC decision to approve this character at this point should be considered a formal decision to overturn a precedent vote. The UTC discussed this topic in November, 2004, on the basis of an earlier version of the proposal: L2/04-395. (The proposal at that time was requesting a "Capital Double S", but the intent was the same as in L2/07-108.) The UTC decided then:

  "101-C22 Consensus: The UTC concurs with Stoetzner that Capital Double S is a typographical issue. Therefore the UTC believes it is inappropriate to encode it as a separate character."

  "101-A74 Action Item for Ken Whistler. Add Capital Double S to the reject list."

And the "Capital Double S" (i.e. the capital sharp S) has been on the list of rejected characters since that time. That fact should not prejudice the current decision about the character now under ballot, but I think it does mean that we are dealing with reversing a precedent, rather than simply approving a new character not formerly discussed and rejected.
Suggestion: UTC to examine the proposal and feedback documents and decide to approve or not to approve.

================================================================
G. Combining Macrons for Coptic (FPDAM 4)

WG2 approved 3 combining marks, intended for use in Coptic text to display macrons across ranges of two or more characters. These were approved on the basis of WG2 N3222 (= L2/07-085), but with an addition, with name changes and code point changes.

The code points and names approved by WG2 for ballot are:

  U+FE24 COMBINING MACRON LEFT HALF
  U+FE25 COMBINING MACRON RIGHT HALF
  U+FE26 COMBINING CONJOINING MACRON

Suggestion: Approve the three new characters, encoded in the Combining Half Marks block.

================================================================
H. Oriya and Malayalam Letters for Vedic (FPDAM 4)

The Vedic proposal WG2 N3235R (= L2/07-095) contained much that requires further discussion and updates, but amongst the content there were 4 Oriya and Malayalam dependent vowels needed to complete the set of Vedic Sanskrit vowels, as written in those scripts. WG2 decided to add those 4 vowels to Amd 4:

  0B44 ORIYA VOWEL SIGN VOCALIC RR
  0B62 ORIYA VOWEL SIGN VOCALIC L
  0B63 ORIYA VOWEL SIGN VOCALIC LL
  0D63 MALAYALAM VOWEL SIGN VOCALIC LL

This follows on, for example, the encoding of similar Malayalam Vedic vowels in Amd 3.

Suggestion: Approve these four new characters.

================================================================
I. Old Cyrillic (FPDAM 4)

WG2 added all the Old Cyrillic characters approved by the UTC, but in the process of working up the draft amendment documents, the contributing editors noted the possibility for a significantly better arrangement of Old Cyrillic and Cyrillic extensions, by moving Bamum a couple of columns over and coalescing the two Cyrillic extension blocks the UTC had previously approved. Asmus Freytag wrote up the proposed movements and block rearrangements in WG2 N3213 (= L2/07-105).
What the UTC approved for Cyrillic extensions:

  A640..A67F Cyrillic Extended-B block
  A8E0..A8FF Cyrillic Extended-C block

What the revised FPDAM 4 reflects as WG2 approved:

  A640..A69F Cyrillic Extended-B block

Suggestion: Approve the merging of the two blocks, with the attendant change in code points for the additional Cyrillic letters for Abkhaz from A8E0..A8F7 to A680..A697.

================================================================
J. Bamum (PDAM 5)

The situation for Bamum (in PDAM 5) reflects the approval of the change in code points for Cyrillic extensions (in FPDAM 4).

What the UTC approved:

  A680..A6DF Bamum

What the new PDAM 5 reflects as WG2 approved:

  A6A0..A6FF Bamum

Except for moving the block over two columns, the characters and their names are otherwise unchanged.

Suggestion: Approve the revised block and changed code points for Bamum.

================================================================
K. Coptic Additions (PDAM 5)

In addition to the special combining macrons for Coptic, WG2 N3222 (= L2/07-085) also requested several other characters for Coptic. These are four cryptogrammic letters and three combining marks found in Coptic manuscripts. Since these particular characters aren't needed with any urgency (as opposed to the combining macrons, which were needed for imminently shipping font implementations for generic Coptic use), WG2 accepted these 7 new characters for Amd 5, rather than accelerating them into Amd 4.
  2CEB COPTIC CAPITAL LETTER CRYPTOGRAMMIC SHEI
  2CEC COPTIC SMALL LETTER CRYPTOGRAMMIC SHEI
  2CED COPTIC CAPITAL LETTER CRYPTOGRAMMIC GANGIA
  2CEE COPTIC SMALL LETTER CRYPTOGRAMMIC GANGIA
  2CEF COPTIC COMBINING NI ABOVE
  2CF0 COPTIC COMBINING SPIRITUS ASPER
  2CF1 COPTIC COMBINING SPIRITUS LENIS

(Aside: In a commodious vicus of recirculation, the glyphs for 2CF0/2CF1, the Coptic derivatives of the Greek rough and smooth breathing marks, have the glyphs that we *used* to show for 0485/0486, the Cyrillic derivatives of the Greek rough and smooth breathing marks, but which have been subsequently corrected in Unicode 5.0, by and for Cyrillicists, to look more like the Greek rough and smooth breathing marks. *sigh*)

Suggestion: Approve the 7 characters for Coptic.

================================================================
L. Egyptian Hieroglyphs (PDAM 5)

Finally! 16 years after publication of Unicode 1.0 with Egyptian hieroglyphs on the cover, a proposal for the encoding of the basic set of Egyptian hieroglyphs has advanced to acceptance for balloting.

WG2 approved 1063 basic Egyptian hieroglyphs, based on WG2 N3237 (= L2/07-097, superseding earlier L2 documents on the topic):

  13000..13426 Egyptian Hieroglyphs (block: 13000..1342F)

Most of the issues for encoding the Egyptian hieroglyphs have been ironed out and have consensus among the participants in drafting the proposal and as reviewed by the community of professional Egyptologists. The one remaining important area of controversy has to do with the representation and encoding of Egyptian numerals. The UTC should review the issue regarding the numerals, but in my personal opinion, the stance taken in the proposal (and approved by WG2 for ballot) is probably the best compromise.

Suggestion: Approve the encoding of the Egyptian hieroglyphs characters and block, as shown in the PDAM 5 draft, WG2 N3265 (= L2/07-132).

================================================================
M.
Old Hangul Jamo Additions (PDAM 5)

After extended discussion through many consecutive WG2 meetings, WG2 finally came to a compromise position to deal with the persistent issue of representation of Old Hangul syllables, as requested by the ROK delegation to WG2.

The ROK withdrew all requests for model changes to the representation of Korean, in return for the agreement to encoding of the additional set of 107 Old Hangul complex jamo letters that complete the attested set of Old Hangul jamos. (These, in effect, represent extensions to the already existing set of more common Old Hangul complex jamo letters, and don't actually change the model of representation of Korean at all.)

WG2 agreed to the proposed allocation of these 107 jamos as proposed by Ireland in WG2 N3242 (= L2/07-103). That proposal filled out the existing 11XX Hangul Jamo block and then made good use of existing crannies in the BMP in the vicinity of the Hangul Syllables block for the remainder. The details are:

In the existing 1100..11FF Hangul Jamo block:

  115A..115E Old Hangul initial consonants
  11A3..11A7 Old Hangul medial vowels
  11FA..11FF Old Hangul final consonants

(Those allocations fill the Hangul Jamo block.)

A960..A97F Hangul Jamo Extended-A block:

  A960..A97C Old Hangul initial consonants

D7B0..D7FF Hangul Jamo Extended-B block:

  D7B0..D7C6 Old Hangul medial vowels
  D7CB..D7FB Old Hangul final consonants

Suggestion: Approve the additional 107 Old Hangul jamo characters and the two new block definitions.

================================================================
N. Tai Viet (PDAM 5)

Removal of one character, change of script name.

The UTC approved the "Tay Viet" script, with corresponding block and character names, based on L2/07-039. WG2 saw a revised proposal, with the script renamed to "Tai Viet", with corresponding block and character names.
The revised proposal (WG2 N3220, = L2/07-099) also removed one character, AAB2 TAI VIET VOWEL AA WITH CIRCUMFLEX, and moved up the following vowels and tone marks to fill the gap at that position.

Suggestion: Approve the revised script and block name, with revised code points and names, as shown in L2/07-099.

================================================================
O. Avestan Separation Point (PDAM 5)

The UTC approved the encoding of the Avestan script at the last meeting. WG2 approved Avestan, on the basis of the same proposal in WG2 N3197 (= L2/07-006) for Amd 5, but with one more character approved than the UTC approved. That character is:

  10B38 AVESTAN SEPARATION POINT

This character was discussed at the last UTC meeting.

Suggestion: Discuss again and decide to approve or not to approve encoding this separation point.

================================================================
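The code point arithmetic quoted in items I, J, and L can be cross-checked mechanically. What follows is only a minimal Python sketch, assuming nothing beyond the ranges stated in this docket; the helper names `size` and `shift` are hypothetical and introduced purely for illustration.

```python
def size(first, last):
    """Number of code points in the inclusive range first..last."""
    return last - first + 1


def shift(first, last, new_first):
    """Relocate an inclusive range so that it starts at new_first."""
    return new_first, new_first + (last - first)


def fmt(rng):
    """Render a (first, last) pair in the docket's XXXX..YYYY notation."""
    return f"{rng[0]:04X}..{rng[1]:04X}"


# Item I: the Abkhaz additions move from A8E0..A8F7 to start at A680.
abkhaz = shift(0xA8E0, 0xA8F7, 0xA680)

# Item I: the merged block should be as large as the two blocks it replaces.
merged_size = size(0xA640, 0xA69F)
separate_size = size(0xA640, 0xA67F) + size(0xA8E0, 0xA8FF)

# Item J: Bamum moves over two 16-cell columns, from A680..A6DF to A6A0.
bamum = shift(0xA680, 0xA6DF, 0xA6A0)

# Item L: count of hieroglyphs in 13000..13426.
hieroglyphs = size(0x13000, 0x13426)

print("Abkhaz:", fmt(abkhaz))
print("Merged vs. separate block size:", merged_size, separate_size)
print("Bamum:", fmt(bamum))
print("Hieroglyphs:", hieroglyphs)
```

Under these assumptions the relocated ranges come out as A680..A697 and A6A0..A6FF, and the hieroglyph range contains 1063 code points, matching the figures given in the items above.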