Subject: CJK Ext-A fix
Date: Tue, 22 Jan 2013 20:41:04 +0000
From: Michel Suignard

Please add this message to the agenda for this UTC meeting.

This could affect 3 G sources for Ext-A characters. This is potentially a Unicode 6.3 issue. Clearly the G source for 3828 is erroneous.


From: Michel Suignard
Sent: Monday, December 03, 2012 11:08 AM
To: 'chen-zhuang'
Cc: csluqin
Subject: RE: Correction of 3 G sources

Dear Chen Zhuang,

This needs a WG2 document with some explanation. I spent some time this morning deciphering what went on here.

Originally the GHZ sources were added in Extension A w/o numeric references. I verified that 10646-2003 had none (it just said GHZ). They were added later using data provided by the Unihan database. The only GHZ source that had no numeric value in Unihan was 3ABF, which explains why it was left as is and was dubious. Removing that value is probably OK. Note that every Microsoft font supporting China has had that character for a while, so to some degree it has become self-referenced. Font vendors use the presence of an IRG source to add the corresponding glyph to their local font, so there is always a small regression risk in removing an existing source. So I am wondering whether we should add a new G ‘virtual’ source to keep the G reference alive.

Concerning 3828, the issue is a duplicate GHZ source shared between that character and 21FE2. Both are RS 46.25, Kangxi 0323.161 and for now GHZ 10810.02. The glyphs are very different. The evidence suggests that the original Unihan GHZ data for 3828 was in error and should have been 10810.03 instead of 10810.02. From a WG2/10646 point of view I think that the G source change for 3828 is not controversial (Unihan still has to fix its kIRGHanyuDaZidian field).

Concerning 400B, the issue is a duplicate GHZ source shared between that character and 2A279. However, they have different RS (108.16 versus 197.10) and Kangxi (0798.171 versus 1507.311). The only visual difference between the two glyphs is that the low ‘Dish’ component spans either the whole cell or the second half. First we have to make sure that the disunification reflected in the two code points (400B and 2A279) is genuine. If it is, then the GHZ source for 400B should be removed, which leaves open the question of whether we should keep a virtual G source, for the same reason as for 3ABF. If the disunification was not correct, we have to determine which of the two is the genuine code point and deprecate the other one (using the UCI notation describing characters with no identified source reference).
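As an illustration of the kind of check that would flag the duplicate GHZ references described above, here is a minimal sketch. It assumes the source data is available as tab-separated lines in the Unihan style (code point, field name, value); the function name find_duplicate_sources and the choice of the kIRG_GSource field are assumptions made for this example, and the sample data contains only the 3828/21FE2 values quoted in this message.

```python
from collections import defaultdict


def find_duplicate_sources(lines, field="kIRG_GSource"):
    """Return {source value: [code points]} for values used by more than one code point.

    `lines` is an iterable of tab-separated records: code point, field, value.
    This is a hypothetical helper for illustration, not part of any Unicode tool.
    """
    by_source = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip malformed records
        code_point, name, value = parts
        if name == field:
            by_source[value].append(code_point)
    return {value: cps for value, cps in by_source.items() if len(cps) > 1}


# Sample records built from the values quoted in this message.
current = [
    "U+3828\tkIRG_GSource\tGHZ-10810.02",
    "U+21FE2\tkIRG_GSource\tGHZ-10810.02",
]
# The same records after the proposed correction for U+3828.
corrected = [
    "U+3828\tkIRG_GSource\tGHZ-10810.03",
    "U+21FE2\tkIRG_GSource\tGHZ-10810.02",
]

duplicates_before = find_duplicate_sources(current)
duplicates_after = find_duplicate_sources(corrected)
print(duplicates_before)
print(duplicates_after)
```

A check like this only reports that two code points share a reference. Deciding which one is in error still requires consulting the dictionary itself, as with 3828 above.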

Depending on the urgency this could be reflected in Amendment 2 (currently under DAM ballot) or the 4th edition (currently under CD ballot). Using the 4th edition is easier. This means that along with the WG2 document, China should consider making a comment concerning this in either ballot.

Best regards,


From: chen-zhuang
Sent: Monday, December 03, 2012 2:15 AM
To: Michel Suignard
Cc: csluqin
Subject: Correction of 3 G sources

Dear Michel,

According to IRG Resolution M39.2, I am requesting that you correct 3 G source references in the UCS.

Resolution IRG M39.2: Editorial issue on CJK Unified Ideographs (IRGN1884, IRGN1896)
Action: China
The IRG requests China to report to WG2 about the deletion and modification of the G-source reference of characters agreed in IRGN1896.

The G source of U+03828 should be changed from GHZ-10810.02 to GHZ-10810.03 according to Hanyu Da Zidian (漢語大字典).
The G sources of U+0400B and U+03ABF 㪿 should be deleted because the real G sources have not been found so far.

Should I prepare a document for WG2?

Best regards,

Chen Zhuang