Stats on “CJKU_SR.txt” and “CJKC_SR.txt” (2010-04-04)

Total references per Source “S”

S  Total    BMP    SIP   ExtB   ExtC   ExtD  CI1.1  CI3.2  CI4.1  CI5.2  CS3.1  UI1.1  UI4.1  UI5.1  UI5.2  XA3.0  XB3.1  XC5.2  XD6.0
   75074  27992  47082  42711   4149    222    302     59    106      3    542  20902     22      8      8   6582  42711   4149    222
G  58828  27105  31723  30528   1119     76      0      0      0      0      0  20902      8      1      0   6194  30528   1119     76
T  56764  24276  31952  30177   1751     24      1      0      0      0    536  18368      0      1      0   5906  30177   1751     24
P  24130  18306   5774   5766      8      0      1      0    105      0     50  15010      0      1      0   3189   5766      8      0
K  18065  17495    570    166    404      0    268      0      0      0      0  15391      0      0      0   1836    166    404      0
J  14164  13387    777    303    367    107     24     59      0      3      0  12560      0      0      3    738    303    367    107
V  10082   5066   5016   4232    784      0      1      0      0      0      0   4757      0      0      0    308   4232    784      0
H   4579   2866   1702   1701      1      0      1      0      0      0     11   2272     14      0      5    574   1701      1      0
U    135     41     94      0     75     19     34      0      0      0      0      0      0      7      0      0      0     75     19
M     16      0     16      0     16      0      0      0      0      0      0      0      0      0      0      0      0     16      0
  186763 108542  77624  72873   4525    226    330     59    105      3    597  89260     22     10      8  18745  72873   4525    226
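Sources: G=PRC, T=TCA/ROC, P=KP=DPRK, K=ROK, J=Japan, V=Vietnam, H=HK, U=US/TUS, M=Macao.
Columns: SIP = ExtB + ExtC + ExtD (excludes CS3.1); CI = CJK Compatibility Ideographs; CS = CJK Compatibility Supplement; UI = CJK Unified Ideographs; XA = CJK Unified Ideographs Extension A; XB = CJK Unified Ideographs Extension B; XC = CJK Unified Ideographs Extension C; XD = CJK Unified Ideographs Extension D. The numeric suffix on a column name is the Unicode version of the corresponding range in the listing below.
Rows: the unlabeled first row counts codepoints rather than source references; the unlabeled last row is the column-wise sum of the per-source rows.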
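The SIP identity in the legend can be spot-checked against the per-source rows. The Python sketch below does so for three rows copied from the table; it also checks a relation that is only inferred from the figures, not stated in the source, namely that a per-source Total equals BMP + SIP + CS3.1. The names ROWS and check_row are illustrative.

```python
# Values copied from the table above, in the order:
# Total, BMP, SIP, ExtB, ExtC, ExtD, CS3.1
ROWS = {
    "G": (58828, 27105, 31723, 30528, 1119, 76, 0),
    "T": (56764, 24276, 31952, 30177, 1751, 24, 536),
    "H": (4579, 2866, 1702, 1701, 1, 0, 11),
}

def check_row(total, bmp, sip, ext_b, ext_c, ext_d, cs):
    """Return (sip_ok, total_ok) for one per-source row."""
    # Stated in the legend: SIP = ExtB + ExtC + ExtD.
    sip_ok = sip == ext_b + ext_c + ext_d
    # Assumption inferred from the data: Total = BMP + SIP + CS3.1.
    total_ok = total == bmp + sip + cs
    return sip_ok, total_ok

for source, values in ROWS.items():
    print(source, check_row(*values))
```

Both checks hold for the three rows shown. The unlabeled first row does not follow the second relation, since its Total equals BMP + SIP alone.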
mysql$ select * from cjkranges where unihan ='1';
+--------------+------------------------------------+---------+--------+
| codepoints   | blockname                          | version | unihan |
+--------------+------------------------------------+---------+--------+
| 3400..4DB5   | CJK Unified Ideographs Extension A | 3.0     |      1 | 
| 4E00..9FA5   | CJK Unified Ideographs             | 1.1     |      1 | 
| 9FA6..9FBB   | CJK Unified Ideographs             | 4.1     |      1 | 
| 9FBC..9FC3   | CJK Unified Ideographs             | 5.1     |      1 | 
| 9FC4..9FCB   | CJK Unified Ideographs             | 5.2     |      1 | 
| F900..FA2D   | CJK Compatibility Ideographs       | 1.1     |      1 | 
| FA30..FA6A   | CJK Compatibility Ideographs       | 3.2     |      1 | 
| FA6B..FA6D   | CJK Compatibility Ideographs       | 5.2     |      1 | 
| FA70..FAD9   | CJK Compatibility Ideographs       | 4.1     |      1 | 
| 20000..2A6D6 | CJK Unified Ideographs Extension B | 3.1     |      1 | 
| 2A700..2B734 | CJK Unified Ideographs Extension C | 5.2     |      1 | 
| 2B740..2B81D | CJK Unified Ideographs Extension D | 6.0     |      1 | 
| 2F800..2FA1D | CJK Compatibility Supplement       | 3.1     |      1 | 
+--------------+------------------------------------+---------+--------+
13 rows in set (0.00 sec)
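
The ranges in this listing are enough to map a codepoint to its column in the first table. A minimal Python sketch, assuming the ranges are copied by hand from the listing; the names RANGES, parse_range, range_size, and column_for are illustrative and not part of the original tooling:

```python
# (codepoints, column label) pairs copied from the cjkranges listing above.
# Each label is the block abbreviation plus the Unicode version of the range.
RANGES = [
    ("3400..4DB5", "XA3.0"),
    ("4E00..9FA5", "UI1.1"),
    ("9FA6..9FBB", "UI4.1"),
    ("9FBC..9FC3", "UI5.1"),
    ("9FC4..9FCB", "UI5.2"),
    ("F900..FA2D", "CI1.1"),
    ("FA30..FA6A", "CI3.2"),
    ("FA6B..FA6D", "CI5.2"),
    ("FA70..FAD9", "CI4.1"),
    ("20000..2A6D6", "XB3.1"),
    ("2A700..2B734", "XC5.2"),
    ("2B740..2B81D", "XD6.0"),
    ("2F800..2FA1D", "CS3.1"),
]

def parse_range(codepoints):
    """Split 'FIRST..LAST' (hexadecimal) into a pair of integers."""
    first, last = codepoints.split("..")
    return int(first, 16), int(last, 16)

def range_size(codepoints):
    """Number of codepoints in an inclusive 'FIRST..LAST' range."""
    first, last = parse_range(codepoints)
    return last - first + 1

def column_for(codepoint):
    """Return the column label for a codepoint, or None if it is in no listed range."""
    for codepoints, label in RANGES:
        first, last = parse_range(codepoints)
        if first <= codepoint <= last:
            return label
    return None
```

For example, column_for(0x4E00) returns "UI1.1", and range_size("4E00..9FA5") returns 20902, which matches the UI1.1 figure in the unlabeled first row of the first table.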

“N” == Total sources “S” per codepoint

N Total BMP SIP ExtB ExtC ExtD Sources S:Total(==BMP+SIP==BMP+ExtB+ExtC+ExtD)
1 26009 2964 23045 19039 3788 218 G:10596(2198+7386+936+76) T:8492(45+6945+1478+24) V:4340(0+3624+716+0) K:920(471+111+338+0) H:818(19+799+0+0) J:681(103+174+301+103) P:105(105+0+0+0) U:47(23+0+9+15) M:10(0+0+10+0)
2 20987 3127 17860 17509 347 4 GT:19317(2540+16677+100+0) GH:289(185+104+0+0) HT:260(6+254+0+0) GJ:237(217+15+5+0) TV:140(0+97+43+0) GP:136(17+119+0+0) GV:120(19+89+12+0) JT:84(15+15+54+0) KT:72(14+9+49+0) GK:62(51+7+4+0) PT:54(5+45+4+0) GU:52(0+0+52+0) HV:52(0+52+0+0) HK:20(12+7+1+0) JU:20(16+0+0+4) KP:20(15+3+2+0) JK:9(7+1+1+0) JV:9(0+7+2+0) HJ:8(4+4+0+0) KV:8(2+3+3+0) TU:7(1+0+6+0) MT:4(0+0+4+0) UV:3(0+0+3+0) GM:1(0+0+1+0) JP:1(1+0+0+0) KU:1(0+0+1+0) PV:1(0+1+0+0)
3 10850 4991 5859 5846 13 0 GPT:7676(2391+5285+0+0) GKT:1149(1136+11+2+0) GHT:760(503+257+0+0) GJT:667(631+33+3+0) GTV:341(137+202+2+0) GKP:49(49+0+0+0) GJK:43(43+0+0+0) GHJ:33(31+2+0+0) GHV:23(11+12+0+0) HTV:17(0+17+0+0) GJP:16(16+0+0+0) GHK:12(10+2+0+0) GHP:11(4+7+0+0) HKT:8(3+5+0+0) GJV:6(5+1+0+0) HJT:6(3+3+0+0) JPT:6(6+0+0+0) GPV:5(2+3+0+0) HPT:3(0+3+0+0) KPT:3(3+0+0+0) PTV:3(0+2+1+0) HJP:2(1+1+0+0) HKP:2(2+0+0+0) JTV:2(1+0+1+0) KTU:2(0+0+2+0) GMT:1(0+0+1+0) GUV:1(0+0+1+0) JKP:1(1+0+0+0) JKT:1(1+0+0+0) JUV:1(1+0+0+0)
5 6857 6847 10 10 0 0 GJKPT:5907(5907+0+0+0) GHKPT:258(258+0+0+0) GKPTV:255(255+0+0+0) GHJKT:163(163+0+0+0) GHJPT:102(99+3+0+0) GJKTV:79(79+0+0+0) GHPTV:33(28+5+0+0) GJPTV:25(23+2+0+0) GHKTV:15(15+0+0+0) GHJKP:9(9+0+0+0) GHJTV:5(5+0+0+0) GHJKV:2(2+0+0+0) GHKPV:2(2+0+0+0) GJKPV:2(2+0+0+0)
4 5506 5198 308 307 1 0 GKPT:3344(3337+7+0+0) GJKT:537(537+0+0+0) GJPT:486(450+36+0+0) GHPT:460(314+146+0+0) GPTV:222(124+98+0+0) GKTV:109(109+0+0+0) GHJT:93(90+3+0+0) GHKT:91(91+0+0+0) GHTV:73(59+14+0+0) GJTV:30(28+2+0+0) GJKP:25(25+0+0+0) GHJK:18(18+0+0+0) GHKP:6(6+0+0+0) GJKV:6(6+0+0+0) GHJP:1(1+0+0+0) GHJV:1(1+0+0+0) GHKV:1(1+0+0+0) GKPV:1(1+0+0+0) HJTV:1(0+1+0+0) KPTU:1(0+0+1+0)
6 4724 4724 0 0 0 0 GJKPTV:3954(3954+0+0+0) GHJKPT:716(716+0+0+0) GHKPTV:25(25+0+0+0) GHJKTV:17(17+0+0+0) GHJPTV:12(12+0+0+0)
7 140 140 0 0 0 0 GHJKPTV:140(140+0+0+0)
0 1 1 0 0 0 0 ???:1(1+0+0+0)
“N” in the above list is the number of sources per codepoint. Each entry in the Sources column, such as “G:10596(2198+7386+936+76)”, gives the breakdown for one source combination: here “G” is the only source mapping for 10596 codepoints (2198 BMP + 7386 ExtB + 936 ExtC + 76 ExtD). For N==3, “GPT:7676(2391+5285+0+0)” means that of the 10850 codepoints with exactly 3 source references, 7676 have the “GPT” combination of G, P (KP), and T source references (2391 BMP + 5285 ExtB + 0 ExtC + 0 ExtD).
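
The entry format described above can be parsed and checked mechanically. The following Python sketch is one possible reading of that format, not the script that produced these statistics; the names ENTRY, parse_entry, and is_consistent are hypothetical.

```python
import re

# SOURCES:TOTAL(BMP+ExtB+ExtC+ExtD); "?" is allowed so that the
# "???" entry in the N==0 row also parses.
ENTRY = re.compile(r"^([A-Z?]+):(\d+)\((\d+)\+(\d+)\+(\d+)\+(\d+)\)$")

def parse_entry(text):
    """Parse one Sources-column entry into a dictionary."""
    match = ENTRY.match(text)
    if match is None:
        raise ValueError(f"unrecognized entry: {text!r}")
    total, bmp, ext_b, ext_c, ext_d = (int(g) for g in match.groups()[1:])
    return {
        "sources": match.group(1),
        "total": total,
        "BMP": bmp,
        "ExtB": ext_b,
        "ExtC": ext_c,
        "ExtD": ext_d,
    }

def is_consistent(entry):
    """True when the total equals BMP + ExtB + ExtC + ExtD."""
    parts = entry["BMP"] + entry["ExtB"] + entry["ExtC"] + entry["ExtD"]
    return entry["total"] == parts

for text in ("G:10596(2198+7386+936+76)", "GPT:7676(2391+5285+0+0)"):
    entry = parse_entry(text)
    print(entry["sources"], len(entry["sources"]), is_consistent(entry))
```

The length of the sources string gives N for ordinary entries, so the two examples print N values of 1 and 3; both are consistent with their stated totals.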

Last modified: Sat Apr 17 18:54:13 2010
