From: John Jenkins (jenkins@apple.com)
Date: Tue Apr 20 2004 - 14:47:05 EDT
On Apr 19, 2004, at 8:40 PM, Ernest Cline wrote:
> For example, if there is a value of kIRGKangXi of the form
> XXXX.YY0 there will always be the same value for the
> kKangXi for that character and vice versa.
>
This is not a safe assumption. There are 37 cases where the kIRGKangXi
field ends in 0 but the kKangXi field is different. (There are 252
instances total where the two fields differ.)
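
A minimal sketch of how such a comparison could be run against a local copy of Unihan.txt. It assumes the usual tab-separated layout (code point, field name, value) with `#` comment lines; the function names `load_fields` and `compare_kangxi` are hypothetical, and the counts it produces depend on the version of the data used.

```python
from collections import defaultdict


def load_fields(lines, wanted):
    """Collect the wanted fields per code point from Unihan-style lines.

    Assumes each data line is "U+XXXX<TAB>kField<TAB>value".
    """
    table = defaultdict(dict)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue  # skip anything that is not a three-column record
        code_point, field, value = parts
        if field in wanted:
            table[code_point][field] = value
    return table


def compare_kangxi(lines):
    """Return (all differing entries, differing entries whose kIRGKangXi ends in 0).

    Only code points that carry both fields are compared.
    """
    table = load_fields(lines, {"kIRGKangXi", "kKangXi"})
    differing = []
    for code_point, fields in table.items():
        irg = fields.get("kIRGKangXi")
        actual = fields.get("kKangXi")
        if irg is None or actual is None:
            continue
        if irg != actual:
            differing.append((code_point, irg, actual))
    ending_in_zero = [entry for entry in differing if entry[1].endswith("0")]
    return differing, ending_in_zero


# Possible usage, assuming Unihan.txt is in the current directory:
# with open("Unihan.txt", encoding="utf-8") as handle:
#     differing, ending_in_zero = compare_kangxi(handle)
#     print(len(differing), len(ending_in_zero))
```

The second list is the one that matters for the assumption above: any entry in it is a character whose kIRGKangXi value ends in 0 and still does not match kKangXi.
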
> I'm trying to pare Unihan.txt down to a less unwieldy size
> for my own use by eliminating properties that are of no
> interest to me and would like to be certain that eliminating
> the four properties containing the actual values for those
> dictionaries can be done safely because the information
> can be reconstituted if necessary from the kIRG*
> properties since I'm not certain if those properties
> are of interest to me.
>
I'm not sure why you feel a need to recreate the four-dictionary
sorting algorithm in the first place because it's really arbitrary and
not all that useful in real life. In any event, it's (theoretically)
based on the kIRGxxxx fields. The others are needed really only if you
want to look the character up in the dictionary in question.
Also, even though the full Unihan database is 25+ MB in size, given the
cheapness of disk space nowadays, it's not all *that* big, surely.
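
For anyone who does still want a smaller file, the following is a rough sketch of a field filter, again assuming the tab-separated Unihan.txt layout with `#` comment lines. The name `filter_unihan` and the `keep` set are hypothetical. Since the dictionary fields cannot always be reconstructed from the kIRG* fields, it is probably safest to keep both if either might be needed later.

```python
def filter_unihan(lines, keep):
    """Yield comment and blank lines unchanged, plus data lines whose field is in keep.

    Assumes each data line is "U+XXXX<TAB>kField<TAB>value".
    """
    for line in lines:
        if line.startswith("#") or not line.strip():
            yield line  # preserve header comments and spacing
            continue
        parts = line.split("\t", 2)
        if len(parts) == 3 and parts[1] in keep:
            yield line


# Possible usage, assuming Unihan.txt is in the current directory:
# keep = {"kIRGKangXi", "kKangXi"}
# with open("Unihan.txt", encoding="utf-8") as src, \
#         open("Unihan-small.txt", "w", encoding="utf-8") as dst:
#     dst.writelines(filter_unihan(src, keep))
```
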
========
John H. Jenkins
jenkins@apple.com
jhjenkins@mac.com
http://homepage.mac.com/jhjenkins/