Public Review Issues

Accumulated Feedback on PRI #361

This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.

Date/Time: Thu Jan 18 13:33:34 CST 2018
Name: Ken Lunde
Report Type: Public Review Issue
Opt Subject: PRI #361 (UAX #38) feedback

The "Description" fields of the kGB7 and kGB8 properties, "The GB 8565-89 mapping 
for this character in ku/ten form," are identical and probably shouldn't be. I 
suspect the one for kGB7 is incorrect. Also, the GB 8565 standard has three parts, 
and Part 2, which is GB 8565.2 and also dated 88 (1988), is the one that should 
be referenced. The kGB8 property, and the corresponding G8 prefix of the
kIRG_GSource property, include bogus source references for characters that are
not included in the GB 8565.2-88 standard, particularly those in Rows 8 and 12,
and in Row 13 starting from value 1351.

Date/Time: Sun Jan 21 00:22:01 CST 2018
Name: Jaemin Chung
Report Type: Error Report
Opt Subject: Wrong descriptions in UAX #38


1. https://www.unicode.org/reports/tr38/index.html#kGB7
The description for kGB7 is completely wrong. GB7 is "General Purpose Hanzi List 
for Modern Chinese Language, and General List of Simplified Hanzi" (see G7 of kIRG_GSource).

2. https://www.unicode.org/reports/tr38/index.html#kGB8 
GB 8565-89 should be GB 8565-88 (or more precisely, GB 8565.2-88).

Feedback above this line was reviewed in the January 2018 UTC meeting.

Date/Time: Fri Jan 26 23:34:46 CST 2018
Name: Ken Lunde
Report Type: Public Review Issue
Opt Subject: UAX #38 feedback

Although PRI #361 has closed, I found that there is an issue with the
Description of the kIRG_JSource property. While the Syntax covers the JMJ
prefix that is used in Extension F, the Description does not include an
entry for it. I suggest the following:

JMJ Moji Joho Kiban Project (文字情報基盤整備事業)

Date/Time: Tue Feb 6 07:50:41 CST 2018
Name: Ken Lunde
Report Type: Public Review Issue
Opt Subject: PRI #361 (UAX #38) feedback

All of the comments below are for the kIRG_KSource property, and together
they amount to an effective overhaul:

1) Change the Syntax from K[0-57C]-[0-9A-F]{4,5} to
K([0-5]-[0-9A-F]{4}|C-[0-9]{5}) to more accurately match the actual source
prefixes and their associated property values (note that the "K7-" source
prefix is not used).

2) Remove "in hex" from "The IRG “K” source mapping for this character in
hex." in the Description field, or change it to "in hexadecimal or decimal,"
because the source references that are associated with the "KC-" prefix use
decimal values, not hexadecimal.

3) According to the latest KS standards, the following entries in the
Description field need to change from:

K2 PKS C 5700-1 1994
K3 PKS C 5700-2 1994
K4 PKS 5700-3:1998
K5 Korean IRG Hanja Character Set 5th Edition: 2001

to the following:

K2 KS X 1027-1:2011 (formerly PKS C 5700-1 1994)
K3 KS X 1027-2:2011 (formerly PKS C 5700-2 1994)
K4 KS X 1027-3:2011 (formerly PKS 5700-3:1998)
K5 KS X 1027-4:2011 (formerly Korean IRG Hanja Character Set 5th Edition: 2001)

The above change is per IRG N1901 (IRG #39), and is already reflected in
Section 23.1 of ISO/IEC 10646:
http://appsrv.cse.cuhk.edu.hk/~irg/irg/irg39/IRGN1901_upd_22.1_23.1.pdf 

I also confirmed the existence of these four KS X 1027 standards on the
KS standards website (and ordered them for good measure):

http://www.kssn.net/stdks/KS_detail.asp?k1=X&k2=1027-1&k3=1 
http://www.kssn.net/stdks/KS_detail.asp?k1=X&k2=1027-2&k3=1 
http://www.kssn.net/stdks/KS_detail.asp?k1=X&k2=1027-3&k3=1 
http://www.kssn.net/stdks/KS_detail.asp?k1=X&k2=1027-4&k3=1 

As a side note, the number of Unihan Database entries for the K2 through K5
source prefixes perfectly matches the number of actual ideographs in each of
the corresponding standards:

K2 = KS X 1027-1:2011 = 7,911
K3 = KS X 1027-2:2011 = 1,834
K4 = KS X 1027-3:2011 = 172
K5 = KS X 1027-4:2011 = 404

4) Add the following source prefix to the Description field (Section 23.1 of
ISO/IEC 10646 already reflects this):

KC Korean History On-Line (한국 역사 정보 통합 시스템)

5) Change the last paragraph in the Description field to the following:

Note that the K4 and K5 sources are expressed in hexadecimal, but unlike the
K0 through K3 sources, they are not organized in row/column format. Also
note that the KC source is expressed as a zero-padded five-digit decimal
value.

That is all.
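
The difference between the current and proposed Syntax patterns in item 1 can be illustrated with a short sketch. The two regular expressions are copied from the comment above; the sample values and the helper name `check` are hypothetical and chosen only to exercise each branch, not taken from the Unihan Database.

```python
import re

# Patterns copied from item 1 above.
OLD_SYNTAX = re.compile(r"K[0-57C]-[0-9A-F]{4,5}")
NEW_SYNTAX = re.compile(r"K([0-5]-[0-9A-F]{4}|C-[0-9]{5})")

# Hypothetical sample values, for illustration only.
SAMPLES = ["K0-4A3B", "KC-00123", "K7-1234", "KC-0A12", "K2-12345"]


def check(value):
    """Return (matches_old, matches_new) for a whole-string match."""
    return (
        OLD_SYNTAX.fullmatch(value) is not None,
        NEW_SYNTAX.fullmatch(value) is not None,
    )


if __name__ == "__main__":
    for value in SAMPLES:
        old_ok, new_ok = check(value)
        print(f"{value:10} old={old_ok!s:5} new={new_ok}")
```

In this sketch the first two samples are accepted by both patterns, while the last three are accepted only by the current pattern: the proposed pattern rejects a "7" prefix, a KC value that is not five decimal digits, and a five-digit value after a K0 through K5 prefix.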

Date/Time: Tue Feb 20 11:01:30 CST 2018
Name: Henry Chan
Report Type: Error Report
Opt Subject: HB0-A2CD should be removed from 卄 (U+5344)

In IRGN2080 IRG#44 Meeting Recommendations M44.7:

Recommendation IRG M44.7: Remapping of H Source Ideograph and Suzhou
Numerals (IRGN2077 and Hong Kong SARG response):

The IRG agreed to re-map the BIG5 Suzhou Numerals, HB0-A2CD to U+3039 (and
thus U+ code mapping from U+5344 to U+3039), and to map HB-A2CC to U+3038
and HB-A2CE to U+303A. The IRG agrees that H-9B4C should not be remapped
from U+69E9 to U+3BA3.

Action for M44.7: IRG Rapporteur is requested to forward the source mapping
change to ISO/IEC 10646 Project Editor for further action/record.

Since HB0-A2CD has been moved to U+3039, the HB0-A2CD glyph should be
removed from U+5344.

Background:

HB0-A2CD is Big5 Suzhou Numeral Twenty. At the time the HB0 mappings were
made, the Suzhou Numerals Ten, Twenty and Thirty were not yet encoded.
The Suzhou Numeral Twenty was unified with U+5344 in error, and Numerals
Ten and Thirty were dropped. With the confirmation of the Hong Kong
CLIAC, it was agreed to remap the Big5 Suzhou Numerals to the proper symbols
already encoded, similar to what has been done by TCA. Thus, HB0-A2CD
should not belong to U+5344, and the glyph and source identifier should be
removed.
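
As a rough illustration of the remapping described in Recommendation IRG M44.7, the sketch below records the three source identifiers and their new code points, then looks up the character names with Python's standard `unicodedata` module. The source identifiers are copied verbatim from the recommendation text; the dictionary and function names are hypothetical. Note that the Unicode character names use the word HANGZHOU for these symbols.

```python
import unicodedata

# New mappings as stated in IRG M44.7 (identifiers copied verbatim).
REMAPPED = {
    "HB-A2CC": 0x3038,
    "HB0-A2CD": 0x3039,
    "HB-A2CE": 0x303A,
}

# Mapping that the report asks to have removed.
OLD_MAPPING = {"HB0-A2CD": 0x5344}


def describe(code_point):
    """Format a code point as 'U+XXXX NAME' using the Unicode character name."""
    return f"U+{code_point:04X} {unicodedata.name(chr(code_point))}"


if __name__ == "__main__":
    for source, code_point in REMAPPED.items():
        print(f"{source} -> {describe(code_point)}")
    for source, code_point in OLD_MAPPING.items():
        print(f"{source} (old) -> {describe(code_point)}")
```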

Date/Time: Tue Mar 13 22:36:40 CDT 2018
Name: wqzrh
Report Type: Error Report
Opt Subject: Incorrect readings for U+8BC6 (识) and U+8BBD (讽)

#
# Unihan_Readings.txt
# Date: 2018-02-14 22:41:22 GMT [JHJ]
# Unicode version: 11.0.0
============================================
U+8BC6	kHanyuPinlu	shi(908) shí(267)
U+8BC6	kMandarin	shì [Note: this reading is wrong; it should be "shí"]
U+8BC6	kXHC1983	1037.010:shí 1492.080:zhì
-------
U+8BBD	kHanyuPinlu	fěng(14)
U+8BBD	kMandarin	fèng [Note: this reading is wrong; it should be "fěng"]
U+8BBD	kXHC1983	0331.080:fěng
-------
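
The kind of cross-check behind this report can be sketched as follows: parse tab-separated Unihan_Readings.txt lines and flag any kMandarin reading that does not appear among the kXHC1983 readings for the same code point. The sample data repeats the kMandarin and kXHC1983 values quoted above without the reporter's notes; the function names are hypothetical. A mismatch is only a hint that a reading deserves review, not proof of an error.

```python
from collections import defaultdict

# Values quoted in the report above, tab-separated as in Unihan_Readings.txt.
SAMPLE = """\
U+8BC6\tkMandarin\tshì
U+8BC6\tkXHC1983\t1037.010:shí 1492.080:zhì
U+8BBD\tkMandarin\tfèng
U+8BBD\tkXHC1983\t0331.080:fěng
"""


def parse(text):
    """Map each code point to a dict of {property: value}."""
    fields = defaultdict(dict)
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        code_point, prop, value = line.split("\t", 2)
        fields[code_point][prop] = value
    return fields


def xhc_readings(value):
    """Extract the readings from 'location:reading' entries."""
    return {entry.rsplit(":", 1)[1] for entry in value.split()}


def suspicious(fields):
    """List (code point, unmatched kMandarin readings, kXHC1983 readings)."""
    flagged = []
    for code_point, props in fields.items():
        if "kMandarin" in props and "kXHC1983" in props:
            known = xhc_readings(props["kXHC1983"])
            unmatched = [r for r in props["kMandarin"].split() if r not in known]
            if unmatched:
                flagged.append((code_point, unmatched, sorted(known)))
    return flagged


if __name__ == "__main__":
    for code_point, unmatched, known in suspicious(parse(SAMPLE)):
        print(code_point, "kMandarin", unmatched, "not in kXHC1983", known)
```

With the sample data, both U+8BC6 and U+8BBD are flagged, which matches the two readings questioned in the report.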

Date/Time: Thu Mar 22 21:03:18 CDT 2018
Name: H.W. HO
Report Type: Error Report
Opt Subject: U+9FEC


The T-source of U+9FEC should be T4-6E5D instead of T4-65ED. 
See http://www.cns11643.gov.tw/AIDB/news_view.do?sn=de and 
http://www.cns11643.gov.tw/AIDB/query_general_view.do?page=4&code=6E5D

Date/Time: Fri Mar 30 16:37:24 CDT 2018
Name: Mo
Report Type: Other Question, Problem, or Feedback
Opt Subject: UNIHAN DATA

(Note: this has been updated by the editor in the Unihan DB.)

The character U+2B6AD has no information at all in the Unihan data.
It is the simplified form of the character U+9C72.
I think its information could be derived from that character.

This is only one example that I noticed. There might be others too.

You are doing a great job. Good luck!

Date/Time: Sat Apr 14 01:17:47 CDT 2018
Name: Jim Breen
Report Type: Public Review Issue
Opt Subject: Unicode Han Database (Unihan)

(Note: this has already been passed to the editors for follow-up.)

Sorry to be slow responding to the Proposed Update.

Two matters:

A. Thanks for continuing to carry a link to the EDICT/etc. project,
but the URL used is long out-of-date. The current version of that page
is at: http://www.edrdg.org/jmdict/edict.html

B. Have you considered including reference numbers/codes for some of the
other major kanji dictionaries in addition to the New Nelson? I think many
students of Japanese would benefit from references to both Jack Halpern's
New Japanese-English Character Dictionary and Spahn & Hadamitzky's The Kanji
Dictionary. In the case of the NJECD you could include the character
sequence numbers, the SKIP codes, or both.

I could assist in providing these codes/numbers.

Best wishes

Jim Breen