Accumulated Feedback on PRI #534

This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.

Date/Time: Tue Oct 21 01:01:22 PT 2025
ReportID: ID20251021010122
Name: Eiso Chan
Report Type: Report Error in Publication/Data
Opt Subject: UTC-00733

The current information is below.

UTC-00733;Variant;U+81A5;130.11;;⿰未⿱成肉;kCheungBauerIndex 453.09;;;

But, this one should be mapped to U+31F1A 𱼚 in CJK Ext. H.
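
As a minimal sketch, the quoted record can be split on semicolons to show which field the proposed remapping touches. The field names and the helper names `parse_usource_line` and `remap` are assumptions inferred from the order visible in the quoted line, not the official USourceData.txt layout; whether the status field would also change is not stated in the report.

```python
# Hypothetical field names, inferred from the quoted line (see UAX #45 for the real layout).
FIELDS = ["source_id", "status", "code_point", "radical_stroke"]

def parse_usource_line(line):
    """Split a semicolon-delimited U-source record into its raw fields."""
    parts = line.split(";")
    record = dict(zip(FIELDS, parts))
    record["rest"] = parts[len(FIELDS):]  # remaining fields kept verbatim
    return record

def remap(line, new_code_point):
    """Return the same record with only the third field (code point) replaced."""
    parts = line.split(";")
    parts[2] = new_code_point
    return ";".join(parts)

current = "UTC-00733;Variant;U+81A5;130.11;;⿰未⿱成肉;kCheungBauerIndex 453.09;;;"
print(parse_usource_line(current)["code_point"])
print(remap(current, "U+31F1A"))
```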

Date/Time: Wed Nov 12 19:50:42 PT 2025
ReportID: ID20251112195042
Name: Eiso Chan
Report Type: Report Error in Publication/Data
Opt Subject: G-Source references for U+3A36 㨶 and U+6440 摀

As I wrote for WS2024-01574 on the IRG WS2024 ORT (https://hc.jsecs.org/irg/ws2024/app/?id=01574), 
I suggest updating the G-Source references as below; the G glyphs should be kept as-is.

U+3A36 kIRG_GSource GKX-0450.19
U+6440 kIRG_GSource G5-4B63

The kGB5 property value under U+3A36 㨶 should be removed and the corresponding value moved to U+6440 摀:

U+6440 kGB5 4367

Date/Time: Wed Nov 19 09:31:17 PT 2025
ReportID: ID20251119093117
Name: Ken Lunde
Report Type: Report Error in Publication/Data
Opt Subject: kTotalStrokes property value changes

The informative kTotalStrokes property values of the following two ideographs, which are also used as components 
of other ideographs, should be changed as follows to conform to IRG stroke-counting conventions:

U+4E9F kTotalStrokes 9 (currently 8)

U+268DE kTotalStrokes 7 (currently 6)

Date/Time: Tue Dec 09 08:59:46 PT 2025
ReportID: ID20251209085946
Name: Ken Lunde
Report Type: PRI Feedback
Opt Subject: PRI 534

I received a report that five kMandarin property values in the Extension I block that were added in Unicode Version 
17.0 are incorrect, which I was able to trace to an off-by-one error on pp. 2 and 3 of document L2/25-029, in which the 
CLDR-TC responded to proposed kMandarin property values in document L2/23-209. The cause is that the 
code point in the referenced URL in the eighth (and final) column is one higher than the code point in the first column, 
and the glyph in the second column is for the character that is one code point higher than that in the first column.

The CLDR-TC confirmed the errors and the following corrections on 2025-12-08:

U+2EC3D kMandarin yuè (currently lǎn)
U+2EC3E kMandarin lǎn (currently qiè)
U+2EC3F kMandarin qiè (currently xíng)
U+2EC41 kMandarin jìn (currently yǔ)
U+2EC52 kMandarin niǎo (currently yàn)
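
The off-by-one pattern can be seen within the corrections themselves wherever two affected code points are adjacent. The following is a minimal sketch using only the values listed above; the helper name `shifted_pairs` is hypothetical, and entries whose neighbour is not in the list cannot be checked without data from outside this report.

```python
# Values copied from the report: code point -> (corrected, current) kMandarin reading.
report = {
    0x2EC3D: ("yuè", "lǎn"),
    0x2EC3E: ("lǎn", "qiè"),
    0x2EC3F: ("qiè", "xíng"),
    0x2EC41: ("jìn", "yǔ"),
    0x2EC52: ("niǎo", "yàn"),
}

def shifted_pairs(table):
    """Return code points whose current value equals the corrected value of the next code point."""
    hits = []
    for cp, (_, current) in sorted(table.items()):
        nxt = table.get(cp + 1)
        if nxt is not None and nxt[0] == current:
            hits.append(cp)
    return hits

hits = shifted_pairs(report)
print([f"U+{cp:X}" for cp in hits])  # ['U+2EC3D', 'U+2EC3E']
```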

Date/Time: Tue Dec 23 12:52:37 PT 2025
ReportID: ID20251223125237
Name: Llinos Evans
Report Type: PRI Feedback
Opt Subject: kTang information

The data in kTang does not seem sufficient. Its current description reads:

The Tang dynasty pronunciation(s) of this ideograph, derived from or consistent with T’ang Poetic Vocabulary by Hugh 
M. Stimson, Far Eastern Publications, Yale University 1976. An asterisk indicates that the word or morpheme represented 
in toto or in part by the given ideograph with the given reading occurs more than four times in the seven hundred poems covered.

Specifically, it does not denote where the pronunciations come from in that text. Are they Pulleyblank/Karlgren/etc 
reconstructions? Is it based on the Qieyun, Guangyun, etc? It would be extremely handy if this were noted if only to 
make it a little more comprehensive, otherwise it isn't very useful to an outsider like myself.

I did try to track down the book but the only listing I could find is £100 and it isn't on any shadow libraries.

Mind lending a hand?


Feedback above this line has already been reviewed during UTC #186 in January, 2026.

Date/Time: Wed Jan 21 21:55:18 PT 2026
ReportID: ID20260121215518
Name: Maximilien Mellen
Report Type: PRI Feedback
Opt Subject: Error in kFourCornerCode for U+71AF (熯)

I would like to report what I believe is an error in the Unihan database property kFourCornerCode for U+71AF (熯).

Current Value: 9403.4
Expected Value: 9483.4

Reasoning:

Left Component: I note that U+706F (灯) is currently listed as 9182.0. Based on this, I would expect any character 
with the Fire radical on the left to follow the pattern 9_8_._. I believe the third digit 0 in the current value 
for 熯 is an error, as the bottom of the Fire radical is typically encoded as 8.

Right Component: The right-hand side is identical to U+6F22 (漢), which is encoded as 3413.4. Therefore, I believe 
熯 should logically match the pattern _4_3.4.

Reference: My understanding is supported by Shirakawa Shizuka's dictionary Jitsū (字通), accessed via JapanKnowledge.com, 
which lists it under the code 9483.
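
The reasoning above amounts to overlaying two partial patterns. This is a minimal sketch in which `_` marks a position that a component leaves open; the helper name `overlay` is hypothetical and only illustrates the argument, not how four-corner codes are assigned in general.

```python
def overlay(left, right):
    """Combine two equal-length patterns; '_' marks a position the pattern leaves open."""
    if len(left) != len(right):
        raise ValueError("patterns must have the same length")
    out = []
    for a, b in zip(left, right):
        if a == "_":
            out.append(b)
        elif b == "_" or a == b:
            out.append(a)
        else:
            raise ValueError(f"patterns conflict: {a!r} vs {b!r}")
    return "".join(out)

# Left-component pattern and right-component pattern, as given in the report.
expected = overlay("9_8_._", "_4_3.4")
print(expected)  # 9483.4
```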

Thank you for your time and for reviewing this correction.

Date/Time: Sat Jan 31 17:32:47 PT 2026
ReportID: ID20260131173247
Name: Eiso Chan
Report Type: PRI Feedback
Opt Subject: kRSUnicode value for U+2A7EE


The current kRSUnicode value for U+2A7EE is 22.5, which is derived from the Chinese surname 欧阳/歐陽. The radicals of 
区 and 區 are both R23. The corresponding traditional form (WS2024-00318:GXM-00530 ⿰區昜) has been included in the IRG 
WS2024 project, and the radical has been confirmed as R23.

Therefore, I suggest updating the kRSUnicode value for U+2A7EE to 23.5.

Date/Time: Thu Feb 05 12:16:05 PT 2026
ReportID: ID20260205121605
Name: Michel Mariani
Report Type: PRI Feedback
Opt Subject: PRI 534 - kSemanticVariant issues


- In the `Unihan_Variants.txt` data file for Unicode 17.0, the following entries contain "asymmetrical" `kSemanticVariant` properties; 
normally, if A is a semantic variant of B, then it is assumed that B is also a semantic variant of A:

U+34C1    kSemanticVariant    U+7F51

- This can be fixed by adding these new entries:


U+3F16    kSemanticVariant    U+2CEAC
U+4E0C    kSemanticVariant    U+2CEA2
U+50CF    kSemanticVariant    U+2CEA8
U+5149    kSemanticVariant    U+2CEA7
U+5152    kSemanticVariant    U+20487
U+5353    kSemanticVariant    U+2CEA5
U+5C35    kSemanticVariant    U+2EC63
U+5E72    kSemanticVariant    U+2CEAD
U+5E79    kSemanticVariant    U+2CEAD
U+65E6    kSemanticVariant    U+2CEA3
U+6EDA    kSemanticVariant    U+2CEA4
U+6EFE    kSemanticVariant    U+2CEA4
U+7217    kSemanticVariant    U+244B9
U+74E5    kSemanticVariant    U+2CEAC
U+78D9    kSemanticVariant    U+2CEA4
U+8B5C    kSemanticVariant    U+8AE9
U+8C61    kSemanticVariant    U+2CEA8
U+8DB3    kSemanticVariant    U+2CEA3
U+24B24    kSemanticVariant    U+2CEAC
U+2DE30    kSemanticVariant    U+2CEAA
 
- And also update (extend) these entries to:

U+5B2A    kSemanticVariant    U+2174F U+21D13

- Note that additional data separated from the added code points by a less-than sign (<) may be needed, but this would require 
appropriate linguistic expertise.


- As a side note, be warned that the contact form has been replacing all original tabulation characters by spaces in this report...
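
The symmetry check described in this report can be sketched as follows. This is a minimal sketch, not the reporter's tooling: the helper names `parse_variants` and `missing_reverse` are hypothetical, the sample lines are for illustration only, and fields are split on any whitespace because the tabs in this report were replaced by spaces.

```python
def parse_variants(lines, prop="kSemanticVariant"):
    """Map each source code point to the set of its variant code points for one property."""
    table = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3 or fields[1] != prop:
            continue
        # Drop any "<source" annotation that follows a code point.
        targets = {value.split("<", 1)[0] for value in fields[2:]}
        table.setdefault(fields[0], set()).update(targets)
    return table

def missing_reverse(table):
    """Return (target, source) pairs where source lists target but target does not list source."""
    return sorted(
        (target, source)
        for source, targets in table.items()
        for target in targets
        if source not in table.get(target, set())
    )

# Illustrative input only; a real run would read Unihan_Variants.txt.
sample = [
    "U+34C1 kSemanticVariant U+7F51",
    "U+8AE9 kSemanticVariant U+8B5C",
    "U+8B5C kSemanticVariant U+8AE9",
]
print(missing_reverse(parse_variants(sample)))  # [('U+7F51', 'U+34C1')]
```

Each returned pair names the entry that would have to be added (or extended) to make the relation symmetric.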

Date/Time: Fri Feb 06 12:10:43 PT 2026
ReportID: ID20260206121043
Name: Michel Mariani
Report Type: PRI Feedback
Opt Subject: PRI 534 - "self-variant" issues in Unihan DB


In the Unihan_Variants.txt data file, only two entries show unexpected "self-variants":

• U+2D016 𭀖

U+2D016    kSemanticVariant    U+5398

should probably be changed to:

U+2D016    kSemanticVariant    U+5398

• U+7274 牴

U+7274    kSpecializedSemanticVariant    U+7274 U+89DD

should probably be changed to:

U+7274    kSpecializedSemanticVariant    U+89DD

or, perhaps, the variant U+62B5 抵 was intended instead of U+7274 牴, in which case more changes would be needed across several entries...
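
A self-variant of the kind reported here is an entry whose value list contains its own code point. The following is a minimal sketch of such a scan; the helper name `self_variants` is hypothetical, and the sample lines are the two forms of the U+7274 entry quoted above.

```python
def self_variants(lines):
    """Yield (code_point, property) for entries whose value list contains the code point itself."""
    for line in lines:
        fields = line.split()
        if len(fields) < 3 or not fields[0].startswith("U+"):
            continue
        code_point, prop, values = fields[0], fields[1], fields[2:]
        # Ignore any "<source" annotation when comparing code points.
        if any(v.split("<", 1)[0] == code_point for v in values):
            yield code_point, prop

sample = [
    "U+7274 kSpecializedSemanticVariant U+7274 U+89DD",  # current entry
    "U+7274 kSpecializedSemanticVariant U+89DD",         # proposed entry
]
found = list(self_variants(sample))
print(found)  # [('U+7274', 'kSpecializedSemanticVariant')]
```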

Date/Time: Thu Feb 12 21:23:16 PT 2026
ReportID: ID20260212212316
Name: John Wilcock
Report Type: PRI Feedback
Opt Subject: PRI 534 - Mismatches between delimiter and UCD for UAX38

Here is a list of properties for which the delimiter attribute is listed as space, but the property only ever contains a single value. In some cases, this might be desirable for future extensibility, but in other cases, such as kKarlgren, the property is inherently single-valued and will not change.

kOtherNumeric
kTayNumeric
kIBMJapan
kIICore
kIRGDaeJaweon
kIRGHanyuDaZidian
kIRGKangXi
kJIS0213
kJis0
kJis1
kKangXi
kKarlgren
kMainlandTelegraph
kMatthews
kMorohashi
kTaiwanTelegraph
kXerox
kZVariant
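
A list like the one above could be produced by scanning the data files for candidate properties whose values never contain a space. This is a minimal sketch, assuming tab-separated lines of the form code point, property, value; the helper name `single_valued_properties` and the property names in the sample are hypothetical placeholders, not real Unihan data.

```python
def single_valued_properties(lines, candidates):
    """Return the candidate properties for which no observed value contains a space."""
    seen = set()
    multi = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.rstrip("\n").split("\t", 2)
        if len(parts) != 3:
            continue
        _, prop, value = parts
        if prop in candidates:
            seen.add(prop)
            if " " in value.strip():
                multi.add(prop)
    return sorted(seen - multi)

# Made-up sample with placeholder property names.
sample = [
    "# comment",
    "U+4E00\tkExampleSingle\t123",
    "U+4E01\tkExampleSingle\t456",
    "U+4E00\tkExampleMulti\tU+4E01 U+4E02",
]
result = single_valued_properties(sample, {"kExampleSingle", "kExampleMulti"})
print(result)  # ['kExampleSingle']
```

Such a scan only shows that a property is single-valued in the data at hand; as the report notes, whether it is single-valued by design is a separate question.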

Date/Time: Tue Feb 17 19:45:01 PT 2026
ReportID: ID20260217194501
Name: Harriet Riddle
Report Type: PRI Feedback
Opt Subject: PRI 534

UTC-00882 can be horizontally extended to U+32562, and UTC-00891 can be horizontally extended to U+327F2.

Date/Time: Wed Feb 18 19:17:47 PT 2026
ReportID: ID20260218191747
Name: Harriet Riddle
Report Type: PRI Feedback
Opt Subject: PRI 534

UTC-00531's “General comments” field in USourceData.txt states that it is “Unifiable with U+51E8”, while UTC-00595's general comments field states 
that “U+36EE is a z-variant”. Does this imply that UTC-00531 and UTC-00595 could be horizontally extended to U+51E8 and U+36EE, respectively? And 
if not, would that mean that those fields would need amending?

Date/Time: Sun Mar 01 08:56:27 PT 2026
ReportID: ID20260301085627
Name: Harriet Riddle
Report Type: PRI Feedback
Opt Subject: PRI 534

Due to J-source homoglyphy, adding the following might be appropriate.

U+4DB9 kSpoofingVariant U+5C6E

U+5C6E kSpoofingVariant U+4DB9

Date/Time: Fri Mar 06 03:20:16 PT 2026
ReportID: ID20260306032016
Name: Judith Chen
Report Type: PRI Feedback
Opt Subject: PRI 534

Per IRG N2826 and L2/26-009, the G-source representative glyphs for U+211AE and U+2B536 have been updated using the font provided in IRG N2844R. 
I suggest also updating their source reference values to GHZR-10766.06 and GGFZ-026700, as proposed in IRG N2843, because these values align better 
with the new glyphs.

Date/Time: Mon Mar 16 18:29:57 PT 2026
ReportID: ID20260316182957
Name: Harriet Riddle
Report Type: PRI Feedback
Opt Subject: PRI 534

The “Property Types § Radical-Stroke Counts” section (section 3.6) lists both `kRSAdobe_Japan1_6` and `kRSUnicode`. Only `kRSAdobe_Japan1_6` is 
actually in the `RadicalStrokeCounts` datafile: `kRSUnicode` is in `IRGSources` (presumably because it's a constituent datafile of ISO10646?).

`kCheungBauer` provides radical-stroke counts and other info, and is in `DictionaryLikeData`. `kRSAdobe_Japan1_6` also provides Adobe-Japan1 
CIDs, analogous to `kMojiJoho` in `DictionaryLikeData`. `kRSJapanese`, `kRSKanWa`, `kRSKorean` and `kRSKangXi` have been removed.

`RadicalStrokeCounts` is currently the only Unihan datafile with only one property, and the second‑smallest. Section 4.3 (“Listing by Location 
within Unihan.zip”) specifically notes that the assignment of properties to Unihan files is not stabilised. Section 2.2 (“Unihan.zip”) states 
the same.

Question 1: is having a separate `RadicalStrokeCounts` datafile which contains only `kRSAdobe_Japan1_6` (and does not include `kRSUnicode`, 
presumably for ISO10646-related reasons as mentioned?) potentially causing more confusion than utility?

Question 2: is a `kRSMojiJoho` property worth proposing, since the Moji Jōhō Kiban does contain such data, and they sometimes differ by IVS, 
e.g. `74.8` for `U+440B+E0100` and `130.8` for `U+440B+E0101`?

Date/Time: Tue Mar 17 20:08:12 PT 2026
ReportID: ID20260317200812
Name: Ken Lunde
Report Type: PRI Feedback
Opt Subject: PRI #534 Feedback for U+2789F

To match the discussion record for IRG Working Set 2024 #4670, the kRSUnicode property value of U+2789F should be changed from:

147.15

to:

211.7 147.15

The original radical should be kept, because it was used for sorting the Extension B block.

Date/Time: Wed Mar 18 14:19:43 PT 2026
ReportID: ID20260318141943
Name: Harriet Riddle
Report Type: PRI Feedback
Opt Subject: PRI 534

The line for UK-01430 in USourceData.txt notes that it was resubmitted for WS2021 as UK-20437.

UK-20437 is now encoded at U+32F6B.

Presumably, its status could be changed to ExtJ, although I do not know what that would imply for its source-references. 
Maybe it could be horizontally extended as UTC-01430, or maybe its source-references ought to be left as they are. In any 
case, the codepoint U+32F6B could be added to the UK-01430 line in USourceData.txt.

Date/Time: Sat Mar 21 18:04:58 PT 2026
ReportID: ID20260321180458
Name: Harriet Riddle
Report Type: PRI Feedback
Opt Subject: PRI 534

UTC-00492 can be horizontally extended to U+3B6E.

UTC-00477 matches two of U+51A4's reference glyphs, and could be horizontally extended.


Some possible IDS sequences for some entries which lack one in USourceData.txt:

UTC-00086: ⿳丷⿻𫩏丷⿻㇂㇒

UTC-00147: ⿹⿱㇋丶凵

UTC-00148: 〾民

UTC-00149: 〾⿻尸⿻丿一

UTC-00158: ⿳廿串𱎛

UTC-00282: ⿹入方

UTC-00288: 〾⿻𡆵⿱丷㇒

UTC-00336: ⿸归一

UTC-00487: ⿷乂丶

UTC-00813: 〾⿱𦥔木


Probably an appropriate addition considering how similar their glyphs are:

U+5F56 kSpoofingVariant U+27C32

U+27C32 kSpoofingVariant U+5F56

Maybe also level-2 UCV candidates: UTC-00133 and UTC-00155 are very similar to U+6B1A and U+25BF5.


The following 17 codepoints could be horizontally extended with existing U-source references.

Most of these already have a kSBGY property matching the value from USourceData.txt.

U+4E12: UTC-00128 (compare UCV #103)

U+6BA9: UTC-00166 (matches G-glyph; kSBGY)

U+748F: UTC-00178 (UCV #373; kSBGY; IDS sequence needs correction to ⿰王⿱⺕⿲土矢匕)

U+206F1: UTC-00127 (analogous to unifications at U+62F6)

U+2263F: UTC-00283 (UCV #96; kSBGY)

U+228D9: UTC-00269 (UCV #252; kSBGY)

U+2354D: UTC-00164 (UCV #305; kSBGY)

U+23742: UTC-00228 (UCV #441; kSBGY)

U+250A9: UTC-00221 (UCV #390; kSBGY)

U+26087: UTC-00184 (UCV #194; kSBGY)

U+26B83: UTC-00212 (analogous to UCV #440; kSBGY)

U+26C2B: UTC-00171 (UCV #307c; kSBGY)

U+27B93: UTC-00202 (UCV #441; kSBGY)

U+27E29: UTC-00204 (UCV #306; kSBGY)

U+2A4E8: UTC-00157 (UCV #355)

U+2E0A0: UTC-00238 (matches G-glyph; closer than U+79B6 despite kSBGY)

U+3011A: UTC-00259 (matches G-glyph; closer than U+206A5 despite kSBGY; G5-342E vs G5-337C)


Some newly-possible U-source horizontal extensions arising from IRGN2910 Appendix B entry 1.7.

Some of these already have a kSBGY property matching the value from USourceData.txt.

U+6550: UTC-00206

U+22FAB: UTC-00195 (already has matching kSBGY)

U+23037: UTC-00216

U+2837F: UTC-00205 (already has matching kSBGY and matching J-glyph)

U+2D8EF: UTC-00162 (closer than U+22F0B despite kSBGY)

Also, UTC-00188 could be given Variant status for U+3A9C (matching kSBGY, but already UTC-03133).

Date/Time: Sat Mar 28 17:32:09 PT 2026
ReportID: ID20260328173209
Name: Harriet Riddle
Report Type: PRI Feedback
Opt Subject: PRI 534


Some consolidated observations from analysis of some more entries in the U-source ideographs database, mostly those with NoAction status 
and no related codepoint listed:

---

UTC-00607 is identical to UK-01490 (U+30BA4); UK-01490 is also in the U-source database, so UTC-00607's status could be changed from 
Rejected to UK-01490.

---

UTC-03076 has Rejected status, presumably because it's a variant of UTC-03075? (The two even have the same IDS sequence listed, although 
⿰义页 might be a more accurate IDS for UTC-03076.) If so, its status could be changed to Variant with reference to U+322B9, now that 
UTC-03075 has been encoded.

UTC-00088's kHanYu value points to a virtual position after U+6291's entry, and the UTC-00088 glyph closely resembles one of the ancient 
script forms shown in that HDZ entry. So UTC-00088 could be changed to Variant status with reference to U+6291 (its IDS could also be 
simplified to ⿰扌𢑏, with U+2244F).

UTC-00687's kHanYu value of 20920.050 is one of U+5B9C's two kHanYu values, so its status could be changed to Variant with reference to U+5B9C.

---

The following U-source ideographs which contain U+26E4C as a component could have their status changed from NoAction to Variant with 
reference to an existing Unicode character containing U+8806 or U+27363:

UTC-00138 (U+24EF9 variant); UTC-00141 (U+21FCB variant); UTC-00319 (U+2865A variant)

---

kSBGY's documentation (in UAX #38) notes that the mappings to Unicode codepoints are sometimes approximate (i.e. potentially to a close variant) 
in cases where the exact form doesn't exist in Unicode. Thus, the following U-source ideographs could have their status changed from NoAction 
to Variant based on which existing (non-unifiable) Unicode character they share a kSBGY value with (which might be the only kSBGY value of the 
existing codepoint, or it might be one of multiple):

UTC-00176, UTC-00177, UTC-00179, UTC-00180, UTC-00187, UTC-00190, UTC-00201, UTC-00207, UTC-00209, UTC-00211, UTC-00218, UTC-00219, UTC-00222, 
UTC-00223, UTC-00224, UTC-00226, UTC-00227, UTC-00229, UTC-00230, UTC-00232, UTC-00236, UTC-00239, UTC-00240, UTC-00241, UTC-00243, UTC-00244, 
UTC-00251, UTC-00253, UTC-00254, UTC-00256, UTC-00260, UTC-00263, UTC-00267, UTC-00270, UTC-00272, UTC-00273, UTC-00274, UTC-00276, UTC-00277, 
UTC-00280, UTC-00284.

Additionally, UTC-00163 and U+4153 have the same kSBGY, but UTC-00163 also seems like it might be a U+2354C variant, so I've cautiously not 
included it in the list above. Similarly, UTC-00261 has one of U+63E9's kSBGY values, but UTC-00261 also seems like it might be a U+22777 
variant, so I've cautiously not included it in the list above.

U+29921 and UTC-00265 both have kSBGY of 140.27 despite extremely different glyphs, and U+220F3 and UTC-00266 both have kSBGY of 149.18 despite 
extremely different glyphs; those two do not seem to fit the criteria for U-source "Variant" status as described in UAX #45 (although that needn't 
necessarily preclude referencing those as the related codepoint without changing the status), and so I have not included them in the list above.

---

Some further possible U-source horizontal extensions:

U+3FBF: UTC-00488 (UCV #355)


U+6EFA: UTC-00438 (UCV #305)

U+8803: UTC-00341 (matches some of the existing glyphs)

U+8D0B: UTC-00480 (matches the existing glyphs)

U+202AC: UTC-00257 (UCV #343)

U+27355: UTC-00194 (UCV #343)

U+28AD1: UTC-00432 (matches the existing glyphs)

U+2ED7D: UTC-00347

U+3191C: UTC-00672

U+31DB5: UTC-00719

U+31F0D: UTC-00731

U+31FA4: UTC-00321 (UCV #179, UCV #305)

---

A few NoAction status U-source ideographs, without kSBGY or kHanYu, where it might be possible to reference a related codepoint and/or change to 
Variant status:

UTC-00124 (maybe related to U+268FC?), UTC-00130 (maybe related to U+22029?), UTC-00134 (U+672C variant), UTC-00139 (U+24D67 variant?), UTC-00140 
(U+5169 variant?), UTC-00151 (U+7D78 variant?), UTC-00153 (U+7DDA variant?).

Also: UTC-01237 may be related to U+3049B (UTC L2/15-098R3 and the original UTC L2/15-177 state simply "Couldn't find printed evidence", and 
UTC L2/15-177R omits it).

Date/Time: Tue Mar 31 11:51:34 PT 2026
ReportID: ID20260331115134
Name: Judith Chen
Report Type: PRI Feedback
Opt Subject: PRI 534

IRG N2855 proposed a series of changes to the Unihan database, but they are not reflected in Unicode 18.0 Alpha. Therefore, based on the 
analysis in IRG N2855, I would like to repropose the following changes:

- U+6287
  - kGSR: (empty) -> 0304e
  - kHanYu: 31837.020 -> 31837.030
  - kIRGHanyuDaZidian: 31837.020 -> 31837.030
  - kHanyuPinyin: (empty) -> 31837.030:hú,gǔ

- U+22A8F
  - kGSR: 0304e -> (empty)
  - kHanYu: 31837.030 -> 31837.020
  - kIRGHanyuDaZidian: 31837.030 -> 31837.020
  - kKangXi: 0420.170 -> 0420.160
  - kHanyuPinyin: 31837.030:hú,gǔ -> (empty)

In addition, I recommend that the V-source reference values and representative glyphs for U+6287 and U+22A8F be swapped, and the J-source 
reference value and representative glyph for U+22A8F be moved to U+6287, based on the analysis in IRG N2855.
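
The property changes listed above for U+6287 and U+22A8F are mostly a straight exchange of values between the two code points. This is a minimal sketch that checks this using only the values quoted in this report; the helper name `is_swap` is hypothetical, and `None` stands for "(empty)".

```python
EMPTY = None  # stands for "(empty)" in the report

# (before, after) per property, copied from the proposal.
u6287 = {
    "kGSR": (EMPTY, "0304e"),
    "kHanYu": ("31837.020", "31837.030"),
    "kIRGHanyuDaZidian": ("31837.020", "31837.030"),
    "kHanyuPinyin": (EMPTY, "31837.030:hú,gǔ"),
}
u22a8f = {
    "kGSR": ("0304e", EMPTY),
    "kHanYu": ("31837.030", "31837.020"),
    "kIRGHanyuDaZidian": ("31837.030", "31837.020"),
    "kKangXi": ("0420.170", "0420.160"),
    "kHanyuPinyin": ("31837.030:hú,gǔ", EMPTY),
}

def is_swap(a, b, prop):
    """True if the two code points simply exchange their values for prop."""
    return prop in a and prop in b and a[prop] == b[prop][::-1]

swapped = sorted(p for p in u22a8f if is_swap(u6287, u22a8f, p))
not_swapped = sorted(p for p in u22a8f if not is_swap(u6287, u22a8f, p))
print(swapped)      # ['kGSR', 'kHanYu', 'kHanyuPinyin', 'kIRGHanyuDaZidian']
print(not_swapped)  # ['kKangXi']
```

Only the kKangXi change on U+22A8F has no counterpart on U+6287 in the proposal as quoted.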


Should Vietnam agree with the changes proposed above, I suggest the following further updates:

- U+6287
  - kVietnamese: nhặt -> vét

- U+22A8F
  - kVietnamese: (empty) -> nhặt

Should Japan agree with the change proposed above, I suggest the following further updates:

- U+6287
  - kMojiJoho: (empty) -> MJ036722
  - kJapanese: (empty) -> コツ ゴチ コチ ケツ ガチ ほる

- U+22A8F
  - kMojiJoho: MJ036722 -> (empty)
  - kJapanese: コツ ゴチ コチ ケツ ガチ ほる -> (empty)

That is all.