Re: [unicode] Unihan database: kCangjie field

From: Charlie Ruland (ruland@luckymail.com)
Date: Sun Jun 14 2009 - 12:57:59 CDT

    If it is true that the Unihan database has Cangjie v.3 input codes for
    only 29,148 characters, whereas Malaysia’s Friends of Cangjie have
    Cangjie v.5 codes for all CJK[V] unified ideographs of Unicode 4.0, why
    not add a “kCangjie5” field based on the more exhaustive data from
    Malaysia to the Unihan database (or, entirely replace the Cangjie v.3
    data of the “kCangjie” field with the Cangjie v.5 data)?
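
To make the proposal concrete, here is a minimal sketch of how v.5 codes could sit next to the existing v.3 data. It assumes both tables are already loaded as dictionaries from code point to code; the function names are hypothetical, and "kCangjie5" is only the field name suggested above, not an existing Unihan field. The two codes for U+9762 are the ones quoted further down in this thread.

```python
def differing_codes(v3, v5):
    """Code points present in both tables whose codes differ."""
    return sorted(cp for cp in v3.keys() & v5.keys() if v3[cp] != v5[cp])


def emit_kcangjie5(v5):
    """Yield Unihan-style, tab-separated lines for the proposed field."""
    for cp in sorted(v5):
        yield f"U+{cp:04X}\tkCangjie5\t{v5[cp]}"


# Sample data: the v.3 and v.5 codes for U+9762 mentioned in this thread.
v3_table = {0x9762: "MWYL"}
v5_table = {0x9762: "MWSL"}

for cp in differing_codes(v3_table, v5_table):
    print(f"U+{cp:04X}: v3={v3_table[cp]} v5={v5_table[cp]}")
for line in emit_kcangjie5(v5_table):
    print(line)
```

Keeping the v.5 data in a separate field would leave the existing "kCangjie" values untouched for anyone who relies on them.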

    BTW, Malaysia’s Friends of Cangjie seem to be willing to have their data
    published: e.g., the English Wiktionary has the page
    http://en.wiktionary.org/wiki/Wiktionary:Chinese_Cangjie_index where it
    says: “Cāngjié data was taken from www.chinesecj.com with permission.”

    Charlie

    -------- Original Message --------
    Subject: Re: [unicode] Unihan database: kCangjie field
    From: mpsuzuki@hiroshima-u.ac.jp
    To: Charlie Ruland <ruland@luckymail.com>
    Date: Sun Jun 14 2009 07:30:59 GMT+0200
    > Hi,
    >
    > Checking the kCangjie entry for U+9762 (面) in Unihan.txt,
    > we can find this line:
    >
    > U+9762 kCangjie MWYL
    >
    > I guess this is the Cangjie version 3 code.
    > In version 5 style, it would be MWSL.
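
For anyone who wants to repeat this check, a small sketch of reading such a line. It assumes the three columns are tab-separated, as in the Unihan data files, and falls back to plain whitespace for lines pasted into mail; parse_unihan_line is a hypothetical helper name.

```python
def parse_unihan_line(line):
    """Return (code point, field, value), or None for comments and blanks."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    # Prefer tabs; fall back to whitespace for lines copied into email.
    parts = line.split("\t") if "\t" in line else line.split(None, 2)
    if len(parts) != 3:
        return None
    codepoint, field, value = parts
    return int(codepoint[2:], 16), field, value  # drop the "U+" prefix


parsed = parse_unihan_line("U+9762\tkCangjie\tMWYL")
if parsed is not None:
    cp, field, value = parsed
    print(f"U+{cp:04X}", chr(cp), field, value)
```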
    >
    > http://zh.wikipedia.org/wiki/%E5%80%89%E9%A0%A1%E8%BC%B8%E5%85%A5%E6%B3%95
    >
    > According to UTR#38, kCangjie field is based on Christian
    > Wittern's cangjie-table.b5.
    >
    >
    >> Tag: kCangjie
    >> Status: Provisional
    >> Category: Dictionary-like Data
    >> Separator: space
    >> Syntax: [A-Z]+
    >> Description: The cangjie input code for the character.
    >> This incorporates data from the file cangjie-table.b5
    >> by Christian Wittern.
    >>
    >
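
The Syntax and Separator entries quoted above can be checked mechanically. This is only a sketch: the regular expression and the space separator come from the quoted description, while is_valid_kcangjie is a hypothetical name. Note that both MWYL and MWSL pass, so the syntax alone cannot tell version 3 codes from version 5 codes.

```python
import re

# Syntax: [A-Z]+ with space as the separator, per the quoted field description.
KCANGJIE_CODE = re.compile(r"[A-Z]+")


def is_valid_kcangjie(value):
    """True if every space-separated code in the value matches [A-Z]+."""
    return all(KCANGJIE_CODE.fullmatch(code) for code in value.split(" "))


for candidate in ("MWYL", "MWSL", "mwyl", ""):
    print(repr(candidate), is_valid_kcangjie(candidate))
```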
    > According to Christian Wittern's web site at Kyoto Univ.,
    > it seems that he has not updated cangjie-table.b5 since
    > 1993-Nov.
    >
    > http://kanji.zinbun.kyoto-u.ac.jp/~wittern/publications/data/index.html
    >
    >> Cangjie Table: Table of all cangjie input keys,
    >> with radical / stroke and BIG5 code,
    >> in: ftp://ifcss.org/software/data, November 1993.
    >>
    >
    > I think the version of cangjie-table.b5 most commonly bundled
    > with free software is 1.02, released in May 1993.
    > For example:
    > http://linenum.info/p/emacs/22.1/leim/MISC-DIC/cangjie-table.b5?page=1
    > http://linenum.info/p/emacs/22.1/leim/MISC-DIC/cangjie-table.b5?page=27
    > It includes 13059 entries, covering Big5 with the ETen extensions.
    >
    > On the other hand, Unihan.txt 5.1.0 (2008-Mar-03) includes
    > 29148 kCangjie entries. I don't know who added the extra kCangjie
    > values for the characters that are not in Christian's original
    > cangjie-table.b5.
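
A count like this can be reproduced by tallying field names over the lines of the data file. The sketch below assumes tab-separated lines with "#" comments and runs on a few illustrative lines rather than the real file; count_fields is a hypothetical helper, and in practice an open file object for Unihan.txt would be passed in instead of the sample list.

```python
from collections import Counter


def count_fields(lines):
    """Count how many data lines carry each field name."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip comments and blank lines
        parts = line.split("\t")
        if len(parts) == 3:
            counts[parts[1]] += 1
    return counts


# Illustrative lines only, not an excerpt from the real file.
sample = [
    "# sample data",
    "U+4E00\tkCangjie\tM",
    "U+4E00\tkTotalStrokes\t1",
    "U+9762\tkCangjie\tMWYL",
]
print(count_fields(sample)["kCangjie"])
```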
    >
    > Regards,
    > mpsuzuki
    >
    > On Sat, 13 Jun 2009 19:14:49 +0200
    > Charlie Ruland <ruland@luckymail.com> wrote:
    >
    >
    >> Which Cangjie version do the Cangjie input codes in the Unihan
    >> database follow?
    >> I couldn't find any explicit information on this in the Unicode Standard
    >> Annex #38: Unicode Han Database (Unihan) at
    >> http://www.unicode.org/reports/tr38/ .
    >> FYI, I use a Cangjie version 5 IME (第五代倉頡輸入法) designed by and
    >> downloaded from Malaysia’s Friends of Cangjie (倉頡之友。馬來西亞 at
    >> http://www.chinesecj.com/newsoftware/index3.php?Type=1 ), which
    >> promises to support input of some 70,000 characters.
    >> Are all Unihan kCangjie codes usable on my IME?
    >>
    >> Charlie
    >>
    >> --
    >> ___ Charlie Ruland ___ 冉書慧 ___
    >> ERROR__COMMVNIS__FACIT__IVS
    >>
    >>
    >>
    >
    >
    >

    -- 
    — Charlie Ruland — 冉書慧 —
    ERROR·COMMVNIS·FACIT·IVS
    

