Re: Component Based Han Ideograph Encoding (WAS: Level of Unicode support required for various languages)

From: Ed Trager (ed.trager@gmail.com)
Date: Tue Oct 30 2007 - 11:43:43 CST

Next message: John H. Jenkins: "Re: Level of Unicode support required for various languages"

Previous message: Peter Constable: "RE: Level of Unicode support required for various languages"
In reply to: Jeroen Ruigrok van der Werven: "Re: Component Based Han Ideograph Encoding (WAS: Level of Unicode support required for various languages)"
Next in thread: vunzndi@vfemail.net: "Re: Component Based Han Ideograph Encoding (WAS: Level of Unicode support required for various languages)"

Hi, Jeroen,

If you are interested, you might want to take a look at my "知了漢英詞典"
Chinese-English dictionary, currently available at:

http://eyegene.ophthy.med.umich.edu/hanyu-xml/

This is an AJAX-based Chinese-English dictionary that I put together.
It is still in a very unfinished state, as I don't have time to work
on it at the moment; it is just one of many "works in progress." Be
sure to use Firefox, Opera, or Safari. My apologies, but I have not
yet done the work needed to get it working properly in Internet
Explorer.

You can type basically anything into the box and get a result back.
Hanzi (Kanji), of course. But you can also type in pinyin with or
without tone marks ("mao", "mao2", "mao ze dong", etc.). Or try
bopomofo (there is a handy entry assistance table on the site, as most
people don't have bopomofo keyboard layouts). Or try English (try
"net", "network", etc.).

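To illustrate how pinyin typed with or without tone marks could be
reduced to a single lookup key, here is a minimal Python sketch. The
function names and the (syllable, tone) representation are assumptions
made for illustration only; the message does not describe how the site
actually handles this.

```python
import unicodedata

# Combining marks used for pinyin tones 1-4 (macron, acute, caron, grave).
TONE_MARKS = {"\u0304": 1, "\u0301": 2, "\u030C": 3, "\u0300": 4}

def normalize_syllable(syllable):
    """Return (toneless syllable, tone or None) for 'mao', 'mao2' or 'máo'."""
    s = syllable.strip().lower()
    tone = None
    # A trailing ASCII digit 1-5 is treated as a tone number.
    if s and s[-1] in "12345":
        tone = int(s[-1])
        s = s[:-1]
    base = []
    # Decompose accented letters so tone marks become separate characters.
    for ch in unicodedata.normalize("NFD", s):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            base.append(ch)
    # Recompose so that "u" + diaeresis becomes "ü" again.
    return unicodedata.normalize("NFC", "".join(base)), tone

def normalize_pinyin(query):
    """Normalize a space-separated pinyin query, one tuple per syllable."""
    return [normalize_syllable(part) for part in query.split()]
```

With this approach "mao", "mao2" and "máo" all share the key "mao", and
the tone, when given, can be used to narrow the result set. Syllables
typed without spaces ("maozedong") would need a separate segmentation
step that is not shown here.
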
But perhaps the most interesting of all is that you can also do lookup
by radical and stroke count, just as you would with a traditional
Chinese dictionary. Clicking on the "部首输入表" (radical input table)
button displays a table of traditional and simplified radicals. Then
you can click on a radical in the table: click on the three-dot water
radical "⺡" (水), for example, and you can then use the custom "spin
widget" to increment or decrement the number of strokes after the
radical and see what the query result set looks like. So "水0" of
course just returns "水" (water), "水1" returns "永" (forever), "水2"
returns "汉" (Han Chinese) among others, and so on. The radical+stroke
count mapping to entries in the database is still fairly small, simply
because I haven't worked on this much.
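
As a rough illustration of the radical+stroke lookup described above,
here is a minimal Python sketch. The in-memory index, the variant
table, and the function names are all hypothetical, and the index holds
only the three examples from this message; the site's real database and
query code are not shown here.

```python
import re

# Hypothetical index: (radical, additional strokes) -> matching characters.
# Only the three examples from the message are included.
RADICAL_INDEX = {
    ("水", 0): ["水"],
    ("水", 1): ["永"],
    ("水", 2): ["汉"],
}

# Variant radical forms mapped to the canonical radical they stand for.
RADICAL_VARIANTS = {"⺡": "水", "氵": "水"}

def parse_radical_query(query):
    """Split a query such as '水2' into (radical, strokes), or return None."""
    match = re.fullmatch(r"(\S)\s*([0-9]+)", query.strip())
    if match is None:
        return None
    radical = RADICAL_VARIANTS.get(match.group(1), match.group(1))
    return radical, int(match.group(2))

def lookup(query):
    """Return the characters filed under the given radical+stroke query."""
    key = parse_radical_query(query)
    if key is None:
        return []
    return list(RADICAL_INDEX.get(key, []))
```

Here lookup("水2") and lookup("⺡2") give the same result, because the
three-dot form is folded into 水 before the index is consulted.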

I would like to add additional systems of lookup. For example, it
would be nice to add the "four corner" method (四角號碼,
http://en.wikipedia.org/wiki/Four_corner_input) to my online
dictionary. (I have a Four Corner dictionary, but personally I rarely
find what I want using this method and usually revert to using
radical+stroke count. But I guess some people find it very easy to
use, and of course it was very widely used in the recent past).

I'm not sure if I would add Halpern's SKIP method or not. I think he
expects a royalty payment for use of his system, so it would only make
sense to do so if I ever turned this into a commercially supported
site, which it obviously is not at the moment. Still, the SKIP system
does appear to be quite nice for looking up Hanzi/Kanji.

If I do get a chance to add additional lookup methods, I will still
try to maintain the single text-entry box for the search term. This
does complicate the code on the back end, but it results in a much
simpler "user interface", which I really prefer.
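
To give an idea of the back-end work that a single text-entry box
implies, here is a minimal Python sketch of a query classifier. The
category names, the Unicode ranges chosen, and the heuristics are
assumptions for illustration; they are not taken from the site's code.

```python
import re
import unicodedata

def is_han(ch):
    """True for common Han ideographs and for radical code points."""
    cp = ord(ch)
    return (0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
            or 0x3400 <= cp <= 0x4DBF   # CJK Extension A
            or 0x2E80 <= cp <= 0x2FDF)  # CJK and Kangxi radicals

def is_bopomofo(ch):
    return 0x3100 <= ord(ch) <= 0x312F  # Bopomofo block

def classify(query):
    """Guess which kind of lookup a raw query string is asking for."""
    q = query.strip()
    if not q:
        return "empty"
    # One Han character or radical followed by digits, e.g. "水2".
    if is_han(q[0]) and re.fullmatch(r"\S\s*[0-9]+", q):
        return "radical_strokes"
    if any(is_han(ch) for ch in q):
        return "hanzi"
    if any(is_bopomofo(ch) for ch in q):
        return "bopomofo"
    # Every syllable ends in a tone digit, e.g. "mao2 ze2 dong1".
    if re.fullmatch(r"(?:[a-zü:]+[1-5]\s*)+", q.lower()):
        return "pinyin"
    # Tone diacritics (macron, acute, caron, grave), e.g. "máo".
    decomposed = unicodedata.normalize("NFD", q)
    if any(ch in "\u0304\u0301\u030C\u0300" for ch in decomposed):
        return "pinyin"
    # Plain Latin text could be toneless pinyin or English; try both.
    return "latin"
```

Plain Latin input such as "mao" or "net" cannot be told apart without
consulting the data, so the sketch returns "latin" and leaves it to the
caller to run both a pinyin and an English search. The diacritic test
is only a heuristic: an accented English loanword would also be
classified as pinyin.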

- Ed Trager

>
> For my own project at http://www.rangaku.org/ I've been looking at ways for
> allowing users to construct their kanji based on parts in order to get
> dictionary results. I guess my usage falls in between SKIP and your idea/goal.
> It definitely is something I am personally interested in, though, and would
> love to help work on.
>
> --
> Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
> イェルーンラウフロックヴァンデルウェルヴェン
> http://www.in-nomine.org/ | http://www.rangaku.org/
> The intuitive mind is a sacred gift and the rational mind is a faithful
> servant. We have created a society that honors the servant and has
> forgotten the gift...
>


This archive was generated by hypermail 2.1.5 : Tue Oct 30 2007 - 11:45:02 CST