Re: Component Based Han Ideograph Encoding (WAS: Level of Unicode support required for various languages)

From: Ed Trager (ed.trager@gmail.com)
Date: Tue Oct 30 2007 - 11:43:43 CST

    Hi, Jeroen,

    If you are interested, you might want to take a look at my "知了漢英詞典"
    Chinese-English dictionary, currently available at:

            http://eyegene.ophthy.med.umich.edu/hanyu-xml/

    This is an AJAX-based Chinese-English dictionary that I put together.
    It is still in a very unfinished state, as I don't have time to deal
    with it at the moment: it is just one of many "works in progress."
    Be sure to use Firefox, Opera, or Safari. My apologies, but I have not
    yet done the work needed to get it working properly in Internet
    Explorer.

    You can basically type anything into the box and get a result back.
    Hanzi (Kanji), of course. But you can also type in pinyin with or
    without tone marks ("mao", "mao2", "mao ze dong", etc.). Or try
    bopomofo (there is a handy entry assistance table on the site, as most
    people don't have bopomofo keyboard layouts). Or try English (try
    "net", "network", etc.).

    But perhaps the most interesting of all is that you can also do lookup
    by radical and stroke count just as you would with a traditional
    Chinese dictionary: Clicking on the "部首输入表" button displays a table of
    traditional and simplified radicals. Then you can click on a radical
    in the table: click on the three-dot water radical "⺡" (水) for example
    and now you can use the custom "spin widget" to increment/decrement
    the number of strokes after the radical and see what the query result
    set looks like. So "水0" of course just returns "水" (water), "水1"
    returns "永" (forever), "水2" returns "汉" (Han Chinese) among others,
    and so on. The radical+stroke count mapping to entries in the
    database is actually still fairly small just because I haven't worked
    on this much.
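
    As a rough illustration of the radical+stroke lookup described above,
    here is a minimal Python sketch. It is not the dictionary's actual
    code: the sample entries, the RADICAL_VARIANTS table, and the function
    names build_index and lookup are all assumptions made up for this
    example. The idea is simply to key each entry on the pair
    (canonical radical, strokes after the radical) and to fold variant
    radical forms such as "⺡" onto "水" before querying.

```python
from collections import defaultdict

# Hypothetical sample data: (character, canonical radical, strokes after radical).
# A real dictionary would load these from its database.
SAMPLE_ENTRIES = [
    ("水", "水", 0),
    ("永", "水", 1),
    ("汉", "水", 2),
    ("汁", "水", 2),
    ("江", "水", 3),
]

# Variant radical shapes mapped to the canonical radical (assumed table).
RADICAL_VARIANTS = {"⺡": "水", "氵": "水"}


def build_index(entries):
    """Group characters by (canonical radical, residual stroke count)."""
    index = defaultdict(list)
    for char, radical, strokes in entries:
        index[(radical, strokes)].append(char)
    return index


def lookup(index, radical, strokes):
    """Return all characters for a radical (or a variant form) and stroke count."""
    canonical = RADICAL_VARIANTS.get(radical, radical)
    return list(index.get((canonical, strokes), []))


index = build_index(SAMPLE_ENTRIES)
print(lookup(index, "⺡", 2))  # the "水2" query from the text
```

    A "spin widget" would then only need to re-run lookup with the stroke
    count incremented or decremented by one.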

    I would like to add additional systems of lookup. For example, it
    would be nice to add the "four corner" method (四角號碼,
    http://en.wikipedia.org/wiki/Four_corner_input) to my online
    dictionary. (I have a Four Corner dictionary, but personally I rarely
    find what I want using this method and usually revert to using
    radical+stroke count. But I guess some people find it very easy to
    use, and of course it was very widely used in the recent past.)

    I'm not sure if I would add Halpern's SKIP method or not. I think he
    expects royalty payments for use of his system, so it would only make
    sense to do so if I ever turned this into a commercially supported
    site, which it obviously is not at the moment. Still, the SKIP system
    does appear to be quite nice for looking up Hanzi/Kanji.

    If I do get a chance to add additional lookup methods, I will still
    try to maintain the single text-entry box for the search term. This
    complicates the code on the back end, but it results in a much
    simpler "user interface", which I really prefer.
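
    To give a feel for what a single text-entry box asks of the back end,
    here is a hedged Python sketch of one way to guess the query type from
    the text alone. It is an assumption-laden illustration, not the site's
    implementation: the function name classify_query, the category labels,
    and the chosen Unicode ranges are all invented for this example. Note
    that plain ASCII input such as "mao ze dong" or "net" cannot reliably
    be told apart, so the sketch labels it "latin" and leaves it to the
    caller to try both pinyin and English lookups.

```python
import re
import unicodedata

# CJK radical blocks plus the main Han blocks (Extension A and the basic block).
HAN_CLASS = r"[\u2e80-\u2fdf\u3400-\u4dbf\u4e00-\u9fff]"
HAN = re.compile(HAN_CLASS)
RADICAL_STROKES = re.compile(HAN_CLASS + r"\d{1,2}")  # e.g. "水2"
BOPOMOFO = re.compile(r"[\u3100-\u312f]")
NUMBERED_SYLLABLE = re.compile(r"[a-zü]+[1-5]?")      # e.g. "mao2"
PLAIN_LATIN = re.compile(r"[A-Za-z\s'-]+")
# Combining macron, acute, caron, grave: the four pinyin tone marks.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def classify_query(text):
    """Guess which lookup method a free-form query is meant for."""
    q = text.strip()
    if not q:
        return "empty"
    if RADICAL_STROKES.fullmatch(q):
        return "radical+strokes"
    if BOPOMOFO.search(q):
        return "bopomofo"
    if HAN.search(q):
        return "hanzi"
    # Decompose accented letters so tone marks become separate code points.
    if any(ch in TONE_MARKS for ch in unicodedata.normalize("NFD", q)):
        return "pinyin"
    tokens = q.lower().split()
    if all(NUMBERED_SYLLABLE.fullmatch(t) for t in tokens) and any(
        t[-1].isdigit() for t in tokens
    ):
        return "pinyin"
    if PLAIN_LATIN.fullmatch(q):
        return "latin"  # ambiguous: toneless pinyin or English
    return "other"


for sample in ["水2", "毛泽东", "mao2", "mao ze dong", "net"]:
    print(sample, "->", classify_query(sample))
```

    The order of the checks matters: "水2" contains a Hanzi, so the
    radical+stroke pattern has to be tested before the general Hanzi case.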

    - Ed Trager

    >
    > For my own project at http://www.rangaku.org/ I've been looking at ways for
    > allowing users to construct their kanji based on parts in order to get
    > dictionary results. I guess my usage falls in between SKIP and your idea/goal.
    > It definitely is something I am personally interested in, though, and would
    > love to help work on.
    >
    > --
    > Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
    > イェルーン ラウフロック ヴァン デル ウェルヴェン
    > http://www.in-nomine.org/ | http://www.rangaku.org/
    > The intuitive mind is a sacred gift and the rational mind is a faithful
    > servant. We have created a society that honors the servant and has
    > forgotten the gift...
    >


