Re: johab compound letters reference for Hangul?

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sat Dec 20 2003 - 17:46:51 EST


    When looking at this document:
    http://std.dkuug.dk/JTC1/SC22/WG20/docs/n1051-hangulsort.pdf
    and its associated data file "n1051t-table-hangulctt6.txt",

    I noted that in the proposed decompositions (which apparently
    work for collating most Hangul compound letters), the
    "Hangul trail consonants" are not all correctly decomposed
    to jongseongs: occurrences of SIOS are given as <S1109>, the
    weight of the _choseong_ SIOS, instead of the expected <S11BA>,
    the weight of the _jongseong_ SIOS...

    For example:

    (...)
            <U11C3> "<S11A8><S11AF>";"<BASE><BASE>";"<MIN><MIN>";<U11C3> %
    HANGUL JONGSEONG KIYEOK-RIEUL
    -> OK
            <U11AA> "<S11A8><S1109>";"<BASE><BASE>";"<MIN><MIN>";<U11AA> %
    HANGUL JONGSEONG KIYEOK-SIOS
    -> Is that correct?
            <U11C4>
    "<S11A8><S1109><S11A8>";"<BASE><BASE><BASE>";"<MIN><MIN><MIN>";<U11C4> %
    HANGUL JONGSEONG KIYEOK-SIOS-KIYEOK
    -> Is that correct?
    (...)
            <U11C6> "<S11AB><S11AE>";"<BASE><BASE>";"<MIN><MIN>";<U11C6> %
    HANGUL JONGSEONG NIEUN-TIKEUT
    -> OK
            <U11C7> "<S11AB><S1109>";"<BASE><BASE>";"<MIN><MIN>";<U11C7> %
    HANGUL JONGSEONG NIEUN-SIOS
    -> Is that correct?
            <U11C8> "<S11AB><S11EB>";"<BASE><BASE>";"<MIN><MIN>";<U11C8> %
    HANGUL JONGSEONG NIEUN-PANSIOS
    -> OK
    (...)
            <U11D5>
    "<S11AF><S11B8><K115F><S11BC>";"<BASE><BASE><BASE><BASE>";"<MIN><MIN><MIN><MIN>";<U11D5> %
    HANGUL JONGSEONG RIEUL-KAPYEOUNPIEUP (RIEUL-PIEUP-IEUNG)
    -> OK
            <U11B3> "<S11AF><S1109>";"<BASE><BASE>";"<MIN><MIN>";<U11B3> %
    HANGUL JONGSEONG RIEUL-SIOS
    -> Is that correct?
            <U11D6>
    "<S11AF><S1109><S1109>";"<BASE><BASE><BASE>";"<MIN><MIN><MIN>";<U11D6> %
    HANGUL JONGSEONG RIEUL-SSANGSIOS
    -> Is that correct?
            <U11D7> "<S11AF><S11EB>";"<BASE><BASE>";"<MIN><MIN>";<U11D7> %
    HANGUL JONGSEONG RIEUL-PANSIOS
    -> OK
    (...)
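
    The entries questioned above can be found mechanically. Here is a
    minimal sketch in Python, assuming the entry layout seen in these
    excerpts (<Uxxxx>, then a quoted list of <Sxxxx> symbols) and
    assuming the ranges U+1100..U+115F for choseongs and U+11A8..U+11FF
    for jongseongs. The function name and the sample string are only
    illustrative; they are not part of n1051.

```python
import re

# Assumed entry layout, taken from the excerpts: '<Uxxxx> "<Sxxxx>...";...'
# \s+ also matches a line break, so wrapped entries are handled.
ENTRY = re.compile(r'<U([0-9A-F]{4})>\s+"((?:<[A-Z][0-9A-F]{4}>)+)"')
SYMBOL = re.compile(r'<S([0-9A-F]{4})>')

# Assumed code point ranges (not stated in the data file itself).
CHOSEONG = range(0x1100, 0x1160)   # leading consonants
JONGSEONG = range(0x11A8, 0x1200)  # trailing consonants


def choseong_in_jongseong(table_text):
    """List jongseong entries whose first weight level uses choseong symbols."""
    hits = []
    for match in ENTRY.finditer(table_text):
        target = int(match.group(1), 16)
        if target not in JONGSEONG:
            continue
        bad = [s for s in SYMBOL.findall(match.group(2))
               if int(s, 16) in CHOSEONG]
        if bad:
            hits.append((f"U+{target:04X}", [f"S{s}" for s in bad]))
    return hits


# Three entries copied from the excerpts (comments dropped).
SAMPLE = '''
<U11C3> "<S11A8><S11AF>";"<BASE><BASE>";"<MIN><MIN>";<U11C3> %
<U11AA> "<S11A8><S1109>";"<BASE><BASE>";"<MIN><MIN>";<U11AA> %
<U11D6>
"<S11AF><S1109><S1109>";"<BASE><BASE><BASE>";"<MIN><MIN><MIN>";<U11D6> %
'''

for code_point, symbols in choseong_in_jongseong(SAMPLE):
    print(code_point, symbols)
```

    On this sample it reports U+11AA and U+11D6 but not U+11C3, which
    matches the entries marked "Is that correct?" above.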

    Is this choice an editorial error in an intermediate, not fully tested data
    file, or a hack to make it work along with actual Korean dictionaries?
    If I simply implemented what is written there, jongseongs containing
    SIOS or SSANGSIOS would not sort with jongseongs containing PANSIOS, nor after
    those containing PIEUP...
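
    To make the consequence concrete, here is a small Python sketch. It
    rests on two assumptions that are NOT confirmed by the data file:
    that each <Sxxxx> symbol is weighted in code point order, and that
    RIEUL-PIEUP decomposes to <S11AF><S11B8> following the same pattern
    as the entries quoted above. The names are illustrative only.

```python
def first_level_key(symbols):
    # Assumption: symbol <Sxxxx> is weighted by its code point xxxx.
    return [int(s[1:], 16) for s in symbols]


def order(table):
    """Return entry names sorted by their first-level symbol sequence."""
    return sorted(table, key=lambda name: first_level_key(table[name]))


# RIEUL-SIOS as given in the data file (choseong SIOS, S1109).
as_published = {
    "RIEUL-PIEUP": ["S11AF", "S11B8"],    # hypothetical entry, see above
    "RIEUL-SIOS": ["S11AF", "S1109"],
    "RIEUL-PANSIOS": ["S11AF", "S11EB"],
}

# Same entries with the expected jongseong SIOS (S11BA).
as_expected = dict(as_published, **{"RIEUL-SIOS": ["S11AF", "S11BA"]})

print(order(as_published))
print(order(as_expected))
```

    Under these assumptions the published decomposition puts RIEUL-SIOS
    before RIEUL-PIEUP, whereas with <S11BA> it falls between
    RIEUL-PIEUP and RIEUL-PANSIOS.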

    Philippe.






    This archive was generated by hypermail 2.1.5 : Sat Dec 20 2003 - 18:26:46 EST