Re: Proposed Update of UTS #10: Unicode Collation Algorithm

From: Jungshik Shin (jshin@mailaps.org)
Date: Fri May 16 2003 - 23:03:35 EDT


    On Fri, 16 May 2003, Mark Davis wrote:

    >> To take the same example as I took in my previous email, I don't see
    >> how S1,S2 and S3 could be sorted S1 < S2 < S3 (instead of S1 < S3 < S2)
    >> without contracting the sequence of 'U+1169 (ㅗ:HANGUL JUNGSEONG O)
    >> U+1163 (ㅑ:HANGUL JUNGSEONG YA)'?
    >>
    >> S1: U+1100 (ᄀ:HANGUL CHOSEONG KIYEOK)
    >> U+1169 (ㅗ:HANGUL JUNGSEONG O)
    >> U+11A8 (ㄱ:HANGUL JONGSEONG KIYEOK)
    >> S2: U+1100 (ᄀ:HANGUL CHOSEONG KIYEOK)
    >> U+116A (ㅘ:HANGUL JUNGSEONG WA)
    >> U+11A8 (ㄱ:HANGUL JONGSEONG KIYEOK)
    >> S3: U+1100 (ᄀ:HANGUL CHOSEONG KIYEOK)
    >> U+1169 (ㅗ:HANGUL JUNGSEONG O)
    >> U+1163 (ㅑ:HANGUL JUNGSEONG YA)
    >> U+11A8 (ㄱ:HANGUL JONGSEONG KIYEOK)
    >>
    >> where the primary weights of each Jamo are given as follows,
    >>
    >> U+1100 (ᄀ:HANGUL CHOSEONG KIYEOK) : 301
    >> U+1161 (ㅏ:HANGUL JUNGSEONG A) : 201
    >> U+1163 (ㅑ:HANGUL JUNGSEONG YA) : 231
    >> U+1169 (ㅗ:HANGUL JUNGSEONG O) : 251
    >> U+116A (ㅘ:HANGUL JUNGSEONG WA) : 255
    >> U+11A8 (ㄱ:HANGUL JONGSEONG KIYEOK) : 101
    >
    > Remember, the weights have to be changed so that: T < V < L, so I'll
    > add 3000 to Ls, 2000 to Vs and 1000 to Ts

      My weights above were assigned with that in mind. L's are
    assigned 300's while V's and T's are assigned 200's and 100's,
    respectively.

    > S1 => 3301; 2251; 1101; TERM
    > S2 => 3301; 2255; 1101; TERM
    > S3 => 3301; 2251; 1231; 1101; TERM

      The third weight of S3 should be 2231, not 1231, because U+1163
    is a V:

      S3 => 3301; 2251; 2231; 1101; TERM

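      Anyway, with or without adding 3000/2000/1000, my point stands. S1
    and S3 first differ at the third weight (1101 < 2231), and S3 and S2
    first differ at the second weight (2251 < 2255), so we have
    S1 < S3 < S2 instead of S1 < S2 < S3.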
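
      The comparison can be checked with a minimal sketch. It assumes a
    plain element-by-element comparison of primary weights only, and a
    TERM value of 0 (a placeholder: every comparison here is decided
    before TERM is reached). The contraction weight 2256 for
    U+1169 U+1163 is hypothetical, chosen only to be greater than the
    weight of U+116A; it is not taken from any real collation table.

```python
# Primary weights from this thread, with 3000/2000/1000 added to L/V/T.
WEIGHTS = {
    0x1100: 3301,  # HANGUL CHOSEONG KIYEOK (L)
    0x1163: 2231,  # HANGUL JUNGSEONG YA (V)
    0x1169: 2251,  # HANGUL JUNGSEONG O (V)
    0x116A: 2255,  # HANGUL JUNGSEONG WA (V)
    0x11A8: 1101,  # HANGUL JONGSEONG KIYEOK (T)
}
TERM = 0  # assumption: placeholder terminator weight, never decisive here

STRINGS = {
    "S1": [0x1100, 0x1169, 0x11A8],
    "S2": [0x1100, 0x116A, 0x11A8],
    "S3": [0x1100, 0x1169, 0x1163, 0x11A8],
}


def sort_key(code_points, contractions=None):
    """Build a tuple of primary weights, applying two-jamo contractions."""
    contractions = contractions or {}
    key = []
    i = 0
    while i < len(code_points):
        pair = tuple(code_points[i:i + 2])
        if pair in contractions:
            # The pair is weighted as a single collation element.
            key.append(contractions[pair])
            i += 2
        else:
            key.append(WEIGHTS[code_points[i]])
            i += 1
    key.append(TERM)
    return tuple(key)


def order(contractions=None):
    """Return the names S1..S3 sorted by their primary-weight keys."""
    return sorted(STRINGS, key=lambda name: sort_key(STRINGS[name], contractions))


# Without a contraction, S3 sorts before S2.
print(order())

# Hypothetical contraction: O + YA gets one weight above WA's 2255.
HYPOTHETICAL_CONTRACTION = {(0x1169, 0x1163): 2256}
print(order(HYPOTHETICAL_CONTRACTION))
```

      The first call reproduces S1 < S3 < S2; only the second, with the
    contraction in place, yields S1 < S2 < S3.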

       Jungshik


