L2/01-266

From: John H. Jenkins [jenkins@apple.com]
Sent: Monday, June 25, 2001 12:23 PM
Subject: Report on IRG 17

The 17th meeting of the Ideographic Rapporteur Group was held in the Hong Kong SAR, 18-22 June 2001. I attended representing the Unicode Consortium, and Hideki Hiura of Sun attended representing L2.

The meeting was very well attended. Representatives were present from the People's Republic of China, Taiwan, the Hong Kong SAR, the Macau SAR, Japan, North Korea, South Korea, and Vietnam. Singapore was the only IRG member not represented.

There were three main items of business.

1) The final tweaks needed to be made to the glyphs for Extension B. This occupied the bulk of the attention of most of the participants for most of the meeting.

2) The DPRK (North Korea) was most anxious to make sure that the mappings between their two standards (KPS 9566-97 and KPS 10721-2000) and ISO/IEC 10646-1 and -2 were adopted into Extension B. They have submitted mappings in the past, but the quality of those mappings has been questioned, and they submitted new mappings for this meeting.

I was assigned to check the quality of their mapping data. There were only limited checks that I could perform in the time allotted during the meeting. I was initially able to identify over two thousand duplicate mappings (i.e., cases where they mapped the same character to two different Unicode characters), but these were caused by a programming glitch and were corrected.

Beyond that, I tried my best to make it clear to them that these mappings are not easily changed once they become part of the standard, and that they should accordingly be willing to accept overall responsibility for the quality of their mapping data. They were fine with that.

Once we had a set of consistent mappings, therefore, they were adopted by the IRG. I have already added them to the Unihan database, and they will be included in the beta release of the 3.1.1 data later this week.

3) Extension C.
Submissions have been received from various IRG members. Most of them are relatively small. South Korea, however, has been scouring the Tripitaka for new ideographs and has identified some 47,000 (!) ideographs that they want to submit to Extension C. The current estimate for the number of ideographs that will be included in Extension C is around 67,000.

Some principles were worked out for what should be included with Extension C submissions.

Hideki and I are in agreement that the bulk of the Extension C characters could be eliminated by having Ideographic Variation Selectors added to the standard. It would be really nice if we could have a UTC resolution on the IVS characters at the next UTC meeting to take to Singapore and WG2.

Beyond this, there were other matters that were raised by Unicode.

1) There is no official JIS X 0213/Unicode mapping table. Vendors are on their own to create them for the moment. Both IBM and Apple have fairly complete tables available. I'll try to get the Apple table cleaned up and distributed in the next little while.

2) Thomas Chan had asked us to forward to the IRG a longish list of questions, mostly regarding whether or not particular dictionaries would be used as sources for Extension C. I'm supposed to write him back and let him know that the IRG basically took no action owing to a lack of time, but that if he knows of specific characters missing from any of these specific dictionaries, it should be possible to include them in the standard.

A couple of other items of interest:

1) On the last day of the meeting, Mr. Zhang pointed out that various people are asking for additional data about ideographs from the IRG, such as pronunciations, definitions, and so on. He felt that the IRG should take up the work of providing such information in the future.
I raised the point that if people want that kind of data, they can *already* get it from Unicode, and we would be happy to donate our Unihan database to the IRG as a starting point.

2) There was a need expressed for a formal IRG mailing list. I, having full confidence in Sarasvati's powers, volunteered Unicode to host it.

A full set of the IRG's resolutions in the form of a Word document is attached.

--
=====
John H. Jenkins
jenkins@apple.com
jenkins@mac.com
http://homepage.mac.com/jenkins/