L2/06-383

Source: Kent Karlsson
Date: November 7, 2006

Please update my previous report (in L2/06-378 public review) on numeric values for some precomposed Hangul syllables with the following more extensive report.

===================================================================

Sino-Korean number syllables.

There are Sino-Korean number names, each a single syllable, for the digits 0-9 and for the numbers 10, 100, 1000, 10000, 100000000, 1000000000000 (the same limit as used for the Han ideographs given numeric values in UCD). These are used for spellout of numbers in the same way as Han ideographs are. There are also native Korean digit/number names, but several of them are polysyllabic, which disqualifies them for this proposal.

For the Sino-Korean number syllables listed below, the following holds:

1) They are written in Hangul, which is used essentially exclusively for Korean (transliterations apart), so although the script is phonetic in nature, in practice it restricts the possible languages to one.

2) Each is a single character (not just a string of characters, though each has a canonical decomposition). Ideally, and for preservation over canonical equivalence, the canonically equivalent strings should also be assigned numeric values. However, I do not propose explicitly assigning numeric values to strings of characters, as that would be a major change in format for DerivedNumericValues.txt. (But I do not oppose it either, in which case the native Korean digit/number names (many are polysyllabic) should also be formally given numeric values in DerivedNumericValues.txt; I am not sure where to put the primary data, though.)

3) They are used in the same way as the corresponding Han ideographs for spellout (though not used in a positional system, except for things like phone numbers, which have no composite numerical value).
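The canonical-equivalence concern in point 2 can be illustrated with a short sketch. It is not part of the proposed data: it assumes Python's standard `unicodedata` module, and the table `SINO_KOREAN_VALUES` and the function `numeric_value` are hypothetical names holding only three of the proposed entries. Normalizing to NFC before lookup is one possible way for an implementation to give canonically equivalent strings the same value without listing strings in DerivedNumericValues.txt.

```python
import unicodedata

# Hypothetical excerpt of the proposed values, keyed by precomposed syllable.
SINO_KOREAN_VALUES = {
    "\uC77C": 1,    # HANGUL SYLLABLE IL
    "\uC0BC": 3,    # HANGUL SYLLABLE SAM
    "\uC2ED": 10,   # HANGUL SYLLABLE SIB
}


def numeric_value(text):
    """Return the value for a syllable in any canonically equivalent form, else None."""
    return SINO_KOREAN_VALUES.get(unicodedata.normalize("NFC", text))


# The full canonical decomposition of U+C0BC is the jamo sequence 1109 1161 11B7.
decomposed = unicodedata.normalize("NFD", "\uC0BC")
print([f"{ord(ch):04X}" for ch in decomposed])

# The LV syllable U+C0AC followed by the trailing consonant U+11B7 is equivalent too.
print(numeric_value("\uC0BC"), numeric_value(decomposed), numeric_value("\uC0AC\u11B7"))
```

All three lookups return the same value because NFC recomposes both decomposed spellings to the single precomposed syllable.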
These characters should be given numeric values in DerivedNumericValues.txt (indirectly from the Unihan database, as several of the numeric valued Han ideographs are also given kHangul values). This is admittedly very close to, or rather is, basic spellout of certain values, but that is also the case for many other characters given numeric values, in particular Han ideographs.

According to the Wikipedia article http://en.wikipedia.org/wiki/Korean_numerals there are North and South variants, marked (n) and (s) below, for the syllables for (traditional form) zero and six, as well as a special name (gong) corresponding to the "modern" Han zero (U+3007).

ACF5; 0 # HANGUL SYLLABLE GONG, 공, decomposed forms: 1100 1169 11BC, 공, ACE0 11BC, 공
AD6C; 9 # HANGUL SYLLABLE GU, 구, decomposed form: 1100 116E, 구
B839; 0 # HANGUL SYLLABLE RYEONG, 령 (n), decomposed forms: 1105 1167 11BC, 령, B824 11BC, 령
B959; 6 # HANGUL SYLLABLE RYUG, 륙 (n), decomposed forms: 1105 1172 11A8, 륙, B958 11A8, 륙
B9CC; 10000 # HANGUL SYLLABLE MAN, 만, decomposed forms: 1106 1161 11AB, 만, B9C8 11AB, 만
BC31; 100 # HANGUL SYLLABLE BAEG, 백, decomposed forms: 1107 1162 11A8, 백, BC30 11A8, 백
C0AC; 4 # HANGUL SYLLABLE SA, 사, decomposed form: 1109 1161, 사
C0BC; 3 # HANGUL SYLLABLE SAM, 삼, decomposed forms: 1109 1161 11B7, 삼, C0AC 11B7, 삼
C2ED; 10 # HANGUL SYLLABLE SIB, 십, decomposed forms: 1109 1175 11B8, 십, C2DC 11B8, 십
C5B5; 100000000 # HANGUL SYLLABLE EOG, 억, decomposed forms: 110B 1165 11A8, 억, C5B4 11A8, 억
C601; 0 # HANGUL SYLLABLE YEONG, 영 (s), decomposed forms: 110B 1167 11BC, 영, C5EC 11BC, 영
C624; 5 # HANGUL SYLLABLE O, 오, decomposed form: 110B 1169, 오
C721; 6 # HANGUL SYLLABLE YUG, 육 (s), decomposed forms: 110B 1172 11A8, 육, C720 11A8, 육
C774; 2 # HANGUL SYLLABLE I, 이, decomposed form: 110B 1175, 이
C77C; 1 # HANGUL SYLLABLE IL, 일, decomposed forms: 110B 1175 11AF, 일, C774 11AF, 일
C870; 1000000000000 # HANGUL SYLLABLE JO, 조, decomposed form: 110C 1169, 조
CC9C; 1000 # HANGUL SYLLABLE CEON, 천, decomposed forms: 110E 1165 11AB, 천, CC98 11AB, 천
CE60; 7 # HANGUL SYLLABLE CIL, 칠, decomposed forms: 110E 1175 11AF, 칠, CE58 11AF, 칠
D314; 8 # HANGUL SYLLABLE PAL, 팔, decomposed forms: 1111 1161 11AF, 팔, D30C 11AF, 팔

===================================================================
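The entries above use a "code point; value # comment" layout, so a consumer could read them with a few lines of code. The following is only a sketch under that assumption. The names `SAMPLE`, `parse_entries`, and `spellout_value` are hypothetical, the sample holds five of the listed entries, and the spellout evaluator is a simplification that handles only numbers below 10000. The proposal itself assigns values to individual characters only, not to composite spellouts.

```python
# Five of the entries listed above, in the same "code point; value # comment" layout.
SAMPLE = """\
BC31; 100 # HANGUL SYLLABLE BAEG
C0BC; 3 # HANGUL SYLLABLE SAM
C2ED; 10 # HANGUL SYLLABLE SIB
C774; 2 # HANGUL SYLLABLE I
C77C; 1 # HANGUL SYLLABLE IL
"""


def parse_entries(text):
    """Map each character to its integer value; everything after '#' is ignored."""
    values = {}
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()
        if not data:
            continue
        code_point, value = (field.strip() for field in data.split(";"))
        values[chr(int(code_point, 16))] = int(value)
    return values


def spellout_value(text, values):
    """Evaluate a spelled-out number below 10000 as a sum of digit-times-multiplier groups."""
    total = 0
    pending = 0  # digit waiting for a multiplier such as 10 or 100
    for ch in text:
        value = values[ch]
        if value >= 10:
            # A multiplier with no preceding digit counts once (e.g. SIB alone is 10).
            total += (pending or 1) * value
            pending = 0
        else:
            pending = value
    return total + pending


values = parse_entries(SAMPLE)
# SAM BAEG I SIB IL, i.e. 3 x 100 + 2 x 10 + 1
print(spellout_value("\uC0BC\uBC31\uC774\uC2ED\uC77C", values))  # prints 321
```

Extending the evaluator to the larger multipliers (10000 and above) would need an extra grouping step, which is omitted here.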