L2/08-204
Title: Quick Report on WG2's Meeting Last Week
Source: Ken Whistler
Date: April 30, 2008

WG2 met last week, April 21 - 25, in Redmond, Washington, at a meeting hosted by Microsoft. For those of you on the list who weren't present and who haven't been actively spelunking through the WG2 document register looking for clues, I thought I would give a quick overview of what took place and the outcome in terms of new amendment progression for 10646.

The meeting had national body participation by Canada, China, Ireland, Japan, the Republic of Korea, Poland, U.K., and the USA. It had liaison participation by SEI - UC Berkeley, the Taipei Computer Association, and the Unicode Consortium. China, in particular, brought quite a few experts, because of their ongoing concerns about pushing forward encodings for minority and historic scripts of China: Old Turkic, Nushu, Tangut, and Lisu all got into ballot, and there was also an expert on Classical Yi who came to work on that script with other experts.

The high-level overview is that WG2 progressed Amd 5 and Amd 6, and didn't start any other new amendment.

For a quick visual overview of where things stand for character repertoire additions, you can refer to the Unicode Pipeline page, which has now been updated with the latest results from WG2:

http://www.unicode.org/alloc/Pipeline.html

Principles and Procedures

WG2 agreed to incorporate material into its Principles and Procedures document based on two UTC/L2 contributions. The first had to do with writing in more detail to ensure that all stakeholders for newly encoded scripts get consulted early on in the process. The second was the set of recommendations about dandas for various already encoded and yet-to-be-encoded Brahmi-derived scripts.

Amendment 5

Amd 5 had its disposition of comments for the FPDAM ballot, and will now be moving on to its (non-technical) FDAM ballot. Basically, this one is now done.
There were a number of difficult issues to resolve for Amd 5, but it amounted to wrangling around the edges. One involved a handful (3) of additional CJK compatibility characters for mapping to the ARIB standard. Those stayed in. Then there were some complications about how to reference Unicode and references to the Ideographic Variation Database, and some details about editing the text and glyphs for Korean. A few characters were removed from Amd 5 for future study. A number of glyph fixes were approved, and those will be applied immediately, by being published in Amd 5, rather than waiting for Amd 6.

Amd 5 now goes to the FDAM ballot with the following repertoire additions:

1. More Old Hangul jamo letters, to complete that set
2. Tai Tham script
3. Tai Viet script
4. Avestan script
5. Egyptian hieroglyphics (Gardiner set and some extensions)
6. CJK Extension C (4149 more CJK unified ideographs)
7. A handful of miscellaneous character additions

Amendment 6

Amd 6 had its disposition of comments on its PDAM ballot. If you recall, PDAM 6 had a rather limited set of new repertoire content, consisting mostly of 3 historic scripts (Imperial Aramaic, Inscriptional Parthian, and Inscriptional Pahlavi), and then a scattering of additional miscellaneous characters.

Because there were a large number of additional proposals ready for ballot, WG2 decided to consolidate them into Amd 6 and simply reballot Amd 6 as a PDAM, rather than going through the extra work of balloting Amd 6 as an FPDAM and putting all the new stuff into a new amendment to ballot as a PDAM -- which would just have created additional work for everybody.

Because the danda recommendations had been picked up by WG2, it was then possible to move Meitei Mayek (now to be spelled Meetei Mayek) out of its on hold status and back to ballot. Dandas were added, and the script was reordered based on feedback from Manipur. It is now in Amd 6.
WG2 then added a number of scripts to Amd 6 that had already been reviewed and approved by the UTC: Javanese, Samaritan, Lisu, Kaithi, Old South Arabian, and Tangut.

There was a fair amount of controversy about the addition of Tangut, which mostly revolved around the issue of whether to hold off until Tangut radicals could all be identified and encoded, along with the repertoire of ideographs, and how that might impact the decision about the order of the large main repertoire (5910 ideographs). In the end, the position of the U.S. and China to proceed with balloting the set of 5910 ideographs, and to let the discussion of the potential encoding of radicals continue and be resolved later, prevailed, and so Tangut was added to Amd 6.

WG2 also added the Vedic extensions that the UTC has been reviewing for so long to Amd 6. The one exception was the prishthamatra e, which was withheld from ballot for now, pending further input from the Government of India and further technical discussions about it.

WG2 also added miscellaneous characters that had been approved by the UTC for Unicode, including the Rumi numeral symbols, the livre tournois sign, the decimal exponent symbol, and a handful of others.

WG2 agreed to add the 289 Tamil named sequences to Amd 6.

There were also a number of additions to Amd 6 of sets of characters that the UTC has not yet approved. These included 18 characters added to the Myanmar script for minority language support, and 39 characters added to the Unified Canadian Aboriginal Syllabics.

WG2 also decided to proceed to add 389 characters for Nushu (a local women's writing system in China). That addition was somewhat controversial, as there are still some questions about the makeup of the repertoire and mapping issues between sources. So it will need to be carefully reviewed during the ballot period.

WG2 added the Old Turkic runes. The UTC has reviewed various iterations of that proposal in the past.
Finally, WG2 agreed to add 186 Japanese TV symbols, for interoperability with the ARIB standard. This is also a proposal that the UTC has worked on for several cycles -- but it was further modified during the WG2 meeting, and will also need careful review during the ballot for Amd 6.

Amd 6 now goes to a second PDAM ballot with the following repertoire additions:

1. Various additions for Vedic texts
2. Myanmar extensions for minority languages
3. UCAS extensions for several languages
4. Meetei Mayek script
5. Lisu script
6. Javanese script
7. Imperial Aramaic script
8. Old South Arabian script
9. Inscriptional Parthian script
10. Inscriptional Pahlavi script
11. Old Turkic script (runes)
12. Kaithi script
13. Tangut script
14. Nushu script
15. 186 symbols for interoperability with the Japanese ARIB TV standard
16. Miscellaneous other character additions

Of potential interest: the WG2 resolutions were all passed unanimously, without even any abstentions. I consider this to indicate that despite the occasional sharp disagreements over some details, WG2 is still proceeding in its work on 10646 with a great deal of consensus among the national bodies participating -- regarding both the standard itself and what kinds of changes should be made to it.

--Ken