johab compound letters reference for Hangul?

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Dec 19 2003 - 14:10:53 EST

    Is there a definitive reference for johab compound composition in Hangul?

    As an example, I use these extra decompositions (not defined in Unicode) in my application, but I'd like advice, notably for the KAPYEOUN choseongs: should they decompose with the horizontal 114C (so that it stacks below), or with the vertical 110B (which normally stacks side by side)?

    My set of overridden UnicodeData lines for CHOSEONGs is as follows:

    (...)
    # add canonical de/recomposition of "Johab" compound Hangul jamos
    #1100;HANGUL CHOSEONG KIYEOK;Lo;0;L;;;;;N;;g *;;;
    1101;HANGUL CHOSEONG SSANGKIYEOK;Lo;0;L;<johab> 1100 1100;;;;N;;gg *;;;
    #1102;HANGUL CHOSEONG NIEUN;Lo;0;L;;;;;N;;n *;;;
    1113;HANGUL CHOSEONG NIEUN-KIYEOK;Lo;0;L;<johab> 1102 1100;;;;N;;ng *;;;
    1114;HANGUL CHOSEONG SSANGNIEUN;Lo;0;L;<johab> 1102 1102;;;;N;;nn *;;;
    1115;HANGUL CHOSEONG NIEUN-TIKEUT;Lo;0;L;<johab> 1102 1103;;;;N;;nd *;;;
    1116;HANGUL CHOSEONG NIEUN-PIEUP;Lo;0;L;<johab> 1102 1107;;;;N;;nb *;;;
    #1103;HANGUL CHOSEONG TIKEUT;Lo;0;L;;;;;N;;d *;;;
    1117;HANGUL CHOSEONG TIKEUT-KIYEOK;Lo;0;L;<johab> 1103 1100;;;;N;;dg *;;;
    1104;HANGUL CHOSEONG SSANGTIKEUT;Lo;0;L;<johab> 1103 1103;;;;N;;dd *;;;
    #1105;HANGUL CHOSEONG RIEUL;Lo;0;L;;;;;N;;r *;;;
    1118;HANGUL CHOSEONG RIEUL-NIEUN;Lo;0;L;<johab> 1105 1102;;;;N;;rn *;;;
    1119;HANGUL CHOSEONG SSANGRIEUL;Lo;0;L;<johab> 1105 1105;;;;N;;rr *;;;
    111A;HANGUL CHOSEONG RIEUL-HIEUH;Lo;0;L;<johab> 1105 1112;;;;N;;rh *;;;
    111B;HANGUL CHOSEONG KAPYEOUNRIEUL;Lo;0;L;<johab> 1105 114C;;;;N;;rq *;;;
    #1106;HANGUL CHOSEONG MIEUM;Lo;0;L;;;;;N;;m *;;;
    111C;HANGUL CHOSEONG MIEUM-PIEUP;Lo;0;L;<johab> 1106 1107;;;;N;;mb *;;;
    111D;HANGUL CHOSEONG KAPYEOUNMIEUM;Lo;0;L;<johab> 1106 114C;;;;N;;mq *;;;
    #1107;HANGUL CHOSEONG PIEUP;Lo;0;L;;;;;N;;b *;;;
    111E;HANGUL CHOSEONG PIEUP-KIYEOK;Lo;0;L;<johab> 1107 1100;;;;N;;bg *;;;
    111F;HANGUL CHOSEONG PIEUP-NIEUN;Lo;0;L;<johab> 1107 1102;;;;N;;bn *;;;
    1120;HANGUL CHOSEONG PIEUP-TIKEUT;Lo;0;L;<johab> 1107 1103;;;;N;;bd *;;;
    1108;HANGUL CHOSEONG SSANGPIEUP;Lo;0;L;<johab> 1107 1107;;;;N;;bb *;;;
    112C;HANGUL CHOSEONG KAPYEOUNSSANGPIEUP;Lo;0;L;<johab> 1108 114C;;;;N;;bbq *;;;
    1121;HANGUL CHOSEONG PIEUP-SIOS;Lo;0;L;<johab> 1107 1109;;;;N;;bs *;;;
    1122;HANGUL CHOSEONG PIEUP-SIOS-KIYEOK;Lo;0;L;<johab> 1107 1109 1100;;;;N;;bsg *;;;
    1123;HANGUL CHOSEONG PIEUP-SIOS-TIKEUT;Lo;0;L;<johab> 1107 1109 1103;;;;N;;bsd *;;;
    1124;HANGUL CHOSEONG PIEUP-SIOS-PIEUP;Lo;0;L;<johab> 1107 1109 1107;;;;N;;bsb *;;;
    1125;HANGUL CHOSEONG PIEUP-SSANGSIOS;Lo;0;L;<johab> 1107 110A;;;;N;;bss *;;;
    1126;HANGUL CHOSEONG PIEUP-SIOS-CIEUC;Lo;0;L;<johab> 1107 1109 110C;;;;N;;bsj *;;;
    1127;HANGUL CHOSEONG PIEUP-CIEUC;Lo;0;L;<johab> 1107 110C;;;;N;;bj *;;;
    1128;HANGUL CHOSEONG PIEUP-CHIEUCH;Lo;0;L;<johab> 1107 110E;;;;N;;bc *;;;
    1129;HANGUL CHOSEONG PIEUP-THIEUTH;Lo;0;L;<johab> 1107 1110;;;;N;;bt *;;;
    112A;HANGUL CHOSEONG PIEUP-PHIEUPH;Lo;0;L;<johab> 1107 1111;;;;N;;bp *;;;
    112B;HANGUL CHOSEONG KAPYEOUNPIEUP;Lo;0;L;<johab> 1107 114C;;;;N;;bq *;;;
    #1109;HANGUL CHOSEONG SIOS;Lo;0;L;;;;;N;;s *;;;
    112D;HANGUL CHOSEONG SIOS-KIYEOK;Lo;0;L;<johab> 1109 1100;;;;N;;sg *;;;
    112E;HANGUL CHOSEONG SIOS-NIEUN;Lo;0;L;<johab> 1109 1102;;;;N;;sn *;;;
    112F;HANGUL CHOSEONG SIOS-TIKEUT;Lo;0;L;<johab> 1109 1103;;;;N;;sd *;;;
    1130;HANGUL CHOSEONG SIOS-RIEUL;Lo;0;L;<johab> 1109 1105;;;;N;;sr *;;;
    1131;HANGUL CHOSEONG SIOS-MIEUM;Lo;0;L;<johab> 1109 1106;;;;N;;sm *;;;
    1132;HANGUL CHOSEONG SIOS-PIEUP;Lo;0;L;<johab> 1109 1107;;;;N;;sb *;;;
    1133;HANGUL CHOSEONG SIOS-PIEUP-KIYEOK;Lo;0;L;<johab> 1109 1107 1100;;;;N;;sbg *;;;
    110A;HANGUL CHOSEONG SSANGSIOS;Lo;0;L;<johab> 1109 1109;;;;N;;ss *;;;
    1134;HANGUL CHOSEONG SIOS-SSANGSIOS;Lo;0;L;<johab> 1109 110A;;;;N;;sss *;;;
    1135;HANGUL CHOSEONG SIOS-IEUNG;Lo;0;L;<johab> 1109 110B;;;;N;;s' *;;;
    1136;HANGUL CHOSEONG SIOS-CIEUC;Lo;0;L;<johab> 1109 110C;;;;N;;sj *;;;
    1137;HANGUL CHOSEONG SIOS-CHIEUCH;Lo;0;L;<johab> 1109 110E;;;;N;;sc *;;;
    1138;HANGUL CHOSEONG SIOS-KHIEUKH;Lo;0;L;<johab> 1109 110F;;;;N;;sk *;;;
    1139;HANGUL CHOSEONG SIOS-THIEUTH;Lo;0;L;<johab> 1109 1110;;;;N;;st *;;;
    113A;HANGUL CHOSEONG SIOS-PHIEUPH;Lo;0;L;<johab> 1109 1111;;;;N;;sp *;;;
    113B;HANGUL CHOSEONG SIOS-HIEUH;Lo;0;L;<johab> 1109 1112;;;;N;;sh *;;;
    #113C;HANGUL CHOSEONG CHITUEUMSIOS;Lo;0;L;;;;;N;;zs *;;;
    113D;HANGUL CHOSEONG CHITUEUMSSANGSIOS;Lo;0;L;<johab> 113C 113C;;;;N;;zss *;;;
    #113E;HANGUL CHOSEONG CEONGCHIEUMSIOS;Lo;0;L;;;;;N;;sz *;;;
    113F;HANGUL CHOSEONG CEONGCHIEUMSSANGSIOS;Lo;0;L;<johab> 113E 113E;;;;N;;ssz *;;;
    #1140;HANGUL CHOSEONG PANSIOS;Lo;0;L;;;;;N;;zz *;;;
    #110B;HANGUL CHOSEONG IEUNG;Lo;0;L;;;;;N;;' *;;;
    1141;HANGUL CHOSEONG IEUNG-KIYEOK;Lo;0;L;<johab> 110B 1100;;;;N;;'g *;;;
    1142;HANGUL CHOSEONG IEUNG-TIKEUT;Lo;0;L;<johab> 110B 1103;;;;N;;'d *;;;
    1143;HANGUL CHOSEONG IEUNG-MIEUM;Lo;0;L;<johab> 110B 1106;;;;N;;'m *;;;
    1144;HANGUL CHOSEONG IEUNG-PIEUP;Lo;0;L;<johab> 110B 1107;;;;N;;'b *;;;
    1145;HANGUL CHOSEONG IEUNG-SIOS;Lo;0;L;<johab> 110B 1109;;;;N;;'s *;;;
    1146;HANGUL CHOSEONG IEUNG-PANSIOS;Lo;0;L;<johab> 110B 1140;;;;N;;'zz *;;;
    1147;HANGUL CHOSEONG SSANGIEUNG;Lo;0;L;<johab> 110B 110B;;;;N;;'' *;;;
    1148;HANGUL CHOSEONG IEUNG-CIEUC;Lo;0;L;<johab> 110B 110C;;;;N;;'j *;;;
    1149;HANGUL CHOSEONG IEUNG-CHIEUCH;Lo;0;L;<johab> 110B 110E;;;;N;;'c *;;;
    114A;HANGUL CHOSEONG IEUNG-THIEUTH;Lo;0;L;<johab> 110B 1110;;;;N;;'t *;;;
    114B;HANGUL CHOSEONG IEUNG-PHIEUPH;Lo;0;L;<johab> 110B 1111;;;;N;;'p *;;;
    #114C;HANGUL CHOSEONG YESIEUNG;Lo;0;L;;;;;N;;q *;;;
    #110C;HANGUL CHOSEONG CIEUC;Lo;0;L;;;;;N;;j *;;;
    114D;HANGUL CHOSEONG CIEUC-IEUNG;Lo;0;L;<johab> 110C 110B;;;;N;;j' *;;;
    110D;HANGUL CHOSEONG SSANGCIEUC;Lo;0;L;<johab> 110C 110C;;;;N;;jj *;;;
    #114E;HANGUL CHOSEONG CHITUEUMCIEUC;Lo;0;L;;;;;N;;zj *;;;
    114F;HANGUL CHOSEONG CHITUEUMSSANGCIEUC;Lo;0;L;<johab> 114E 114E;;;;N;;zjj *;;;
    #1150;HANGUL CHOSEONG CEONGCHIEUMCIEUC;Lo;0;L;;;;;N;;jz *;;;
    1151;HANGUL CHOSEONG CEONGCHIEUMSSANGCIEUC;Lo;0;L;<johab> 1150 1150;;;;N;;jjz *;;;
    #110E;HANGUL CHOSEONG CHIEUCH;Lo;0;L;;;;;N;;c *;;;
    1152;HANGUL CHOSEONG CHIEUCH-KHIEUKH;Lo;0;L;<johab> 110E 110F;;;;N;;ck *;;;
    1153;HANGUL CHOSEONG CHIEUCH-HIEUH;Lo;0;L;<johab> 110E 1112;;;;N;;ch *;;;
    #1154;HANGUL CHOSEONG CHITUEUMCHIEUCH;Lo;0;L;;;;;N;;zc *;;;
    #1155;HANGUL CHOSEONG CEONGCHIEUMCHIEUCH;Lo;0;L;;;;;N;;cz *;;;
    #110F;HANGUL CHOSEONG KHIEUKH;Lo;0;L;;;;;N;;k *;;;
    #1110;HANGUL CHOSEONG THIEUTH;Lo;0;L;;;;;N;;t *;;;
    #1111;HANGUL CHOSEONG PHIEUPH;Lo;0;L;;;;;N;;p *;;;
    1156;HANGUL CHOSEONG PHIEUPH-PIEUP;Lo;0;L;<johab> 1111 1107;;;;N;;pb *;;;
    1157;HANGUL CHOSEONG KAPYEOUNPHIEUPH;Lo;0;L;<johab> 1111 110B;;;;N;;pq *;;;
    #1112;HANGUL CHOSEONG HIEUH;Lo;0;L;;;;;N;;h *;;;
    1158;HANGUL CHOSEONG SSANGHIEUH;Lo;0;L;<johab> 1112 1112;;;;N;;hh *;;;
    #1159;HANGUL CHOSEONG YEORINHIEUH;Lo;0;L;;;;;N;;h' *;;;
    (...)
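
    A minimal sketch of how override lines like these could be read, assuming they keep the semicolon-separated UnicodeData field layout, with the decomposition in the sixth field. The function name parse_overrides and the SAMPLE excerpt are illustrative only, and <johab> is the private tag used in this listing, not a Unicode compatibility tag:

```python
# Three lines copied from the listing above; the first is commented out
# there because the basic jamo has no decomposition.
SAMPLE = """\
#1109;HANGUL CHOSEONG SIOS;Lo;0;L;;;;;N;;s *;;;
110A;HANGUL CHOSEONG SSANGSIOS;Lo;0;L;<johab> 1109 1109;;;;N;;ss *;;;
1134;HANGUL CHOSEONG SIOS-SSANGSIOS;Lo;0;L;<johab> 1109 110A;;;;N;;sss *;;;
"""


def parse_overrides(text):
    """Return {code point: [component code points]} for <johab> lines."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, '#' comment lines and elision markers such as "(...)".
        if not line or line.startswith("#"):
            continue
        fields = line.split(";")
        if len(fields) < 6 or not fields[5].startswith("<johab>"):
            continue
        # Drop the "<johab>" tag, keep the hexadecimal components.
        components = fields[5].split()[1:]
        table[int(fields[0], 16)] = [int(c, 16) for c in components]
    return table


if __name__ == "__main__":
    for cp, parts in sorted(parse_overrides(SAMPLE).items()):
        print(f"{cp:04X} -> {' '.join(f'{p:04X}' for p in parts)}")
```

    Only the one-level decomposition is stored here; expanding nested compounds is a separate step.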

    It generates these:
    # gen/normalize.txt
    #code;cc;nfd;nfkdFolded;# CHAR ;NFD;NFKDFOLDED;
    (...)
    1101;;;1100 1100;# ?? ;?;???;
    1113;;;1102 1100;# ?? ;?;???;
    1114;;;1109 1109;# ?? ;?;???;
    1115;;;1102 1103;# ?? ;?;???;
    1116;;;1102 1107;# ?? ;?;???;
    1117;;;1103 1100;# ?? ;?;???;
    1118;;;1105 1102;# ?? ;?;???;
    1119;;;1105 1105;# ?? ;?;???;
    111A;;;1105 1112;# ?? ;?;???;
    111B;;;1105 114C;# ?? ;?;???;
    111D;;;1106 114C;# ?? ;?;???;
    111E;;;1107 1100;# ?? ;?;???;
    111F;;;1107 1102;# ?? ;?;???;
    1120;;;1107 1103;# ?? ;?;???;
    112C;;;1107 114C;# ?? ;?;???;
    112D;;;1109 1100;# ?? ;?;???;
    112E;;;1109 1102;# ?? ;?;???;
    112F;;;1109 1103;# ?? ;?;???;
    1130;;;1109 1105;# ?? ;?;???;
    1131;;;1109 1106;# ?? ;?;???;
    1132;;;1109 1107;# ?? ;?;???;
    1133;;;1109 1107 1100;# ?? ;?;????;
    1134;;;1109 1109 1109;# ?? ;?;????;
    1135;;;1109 110B;# ?? ;?;???;
    1136;;;1109 110C;# ?? ;?;???;
    1138;;;1109 110F;# ?? ;?;???;
    1139;;;1109 1110;# ?? ;?;???;
    113A;;;1109 1111;# ?? ;?;???;
    113B;;;1109 1112;# ?? ;?;???;
    113D;;;113C 113C;# ?? ;?;???;
    113F;;;113E 113E;# ?? ;?;???;
    1141;;;110B 1100;# ?? ;?;???;
    1143;;;110B 1106;# ?? ;?;???;
    1144;;;110B 1107;# ?? ;?;???;
    1145;;;110B 1109;# ?? ;?;???;
    1146;;;110B 1140;# ?? ;?;???;
    1147;;;110B 110B;# ?? ;?;???;
    1148;;;110B 110C;# ?? ;?;???;
    1149;;;110B 110E;# ?? ;?;???;
    114B;;;110B 1111;# ?? ;?;???;
    114D;;;110C 110B;# ?? ;?;???;
    114F;;;114E 114E;# ?? ;?;???;
    1151;;;1150 1150;# ?? ;?;???;
    1152;;;110E 110F;# ?? ;?;???;
    1153;;;110E 1112;# ?? ;?;???;
    1156;;;1111 1107;# ?? ;?;???;
    1158;;;1112 1112;# ?? ;?;???;
    (...)
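
    The generated lines appear to hold the full expansion rather than the one-level decomposition (1134 is listed as 1109 1109 1109). A sketch of that recursive expansion, under the assumption that the table contains no cycles; the names JOHAB, expand and normalize_line are hypothetical, and the trailing comment columns of normalize.txt are left out:

```python
# One-level decompositions taken from the override listing.
JOHAB = {
    0x110A: [0x1109, 0x1109],          # SSANGSIOS
    0x1134: [0x1109, 0x110A],          # SIOS-SSANGSIOS
    0x1133: [0x1109, 0x1107, 0x1100],  # SIOS-PIEUP-KIYEOK
}


def expand(cp, table):
    """Recursively replace compound jamos by their basic components."""
    parts = table.get(cp)
    if parts is None:
        return [cp]  # basic jamo: nothing left to decompose
    result = []
    for part in parts:
        result.extend(expand(part, table))
    return result


def normalize_line(cp, table):
    """Format the 'code;cc;nfd;nfkdFolded' columns; cc and nfd stay empty."""
    sequence = " ".join(f"{c:04X}" for c in expand(cp, table))
    return f"{cp:04X};;;{sequence}"


if __name__ == "__main__":
    for cp in sorted(JOHAB):
        print(normalize_line(cp, JOHAB))
```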

    I have similar rules for some JUNGSEONGs (medial vowels) and JONGSEONGs (trailing consonants), and I also use these rules to generate romanizations of the complete Hangul set (not only for the jamos that are decomposed from standard encoded syllables), as shown in the "ISO comment" column, which holds these default romanizations.

    But I'd also like some advice on the chosen letters (notably because such a standard probably exists, used for example in romanized input methods for Hangul keyboards).

    The intent is to mimic the behavior of search engines that deal with various Korean Hangul charsets that are not always based on the Johab and Wansung sets of Hangul jamos, and to build a lexeme parser for full-text searches. In this case the romanization is just a mapping of the simplified basic jamos to lowercase Latin letters, where choseong sequences are coded with a leading uppercase letter unless the choseong is an IEUNG (in which case I insert a space to keep syllables separable).
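
    One possible reading of this encoding, as a sketch only. It assumes each syllable is already split into the romanization of its choseong (taken from the "ISO comment" column above) and the romanization of the rest of the syllable. The vowel letters "a" and "o" and the function names are placeholders, and compound choseongs that start with IEUNG are not treated specially here:

```python
IEUNG = "'"  # romanization of the plain IEUNG choseong in the listing


def encode_syllable(choseong, rest):
    """Mark the syllable start: uppercase first letter, or a space for IEUNG."""
    if choseong == IEUNG:
        return " " + rest
    return choseong[0].upper() + choseong[1:] + rest


def encode_word(syllables):
    """Concatenate (choseong, rest) pairs into a searchable key."""
    return "".join(encode_syllable(c, r) for c, r in syllables)


if __name__ == "__main__":
    # "g" is KIYEOK and "bs" is PIEUP-SIOS in the listing; "a"/"o" are placeholders.
    print(encode_word([("g", "a"), ("bs", "o")]))
    print(encode_word([("g", "a"), ("'", "a")]))
```

    Either marker (the uppercase letter or the space) lets a parser find syllable boundaries again without a separate delimiter.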

    Any advice?


