Unihan DB / kKarlgren / kFrequency.

From: Pierpaolo BERNARDI (bernardp@cli.di.unipi.it)
Date: Sun Feb 23 2003 - 10:50:32 EST

    In the Unihan-3.2.0.txt file the field kKarlgren is described as:

    # The index of this character in _Analytic Dictionary of Chinese and
    # Sino-Japanese_ by Bernhard Karlgren, New York: Dover Publications,
    # Inc., 1974.
    # If the index is followed by an asterisk (*), then the index is an
    # interpolated one, indicating where the character would be found
    # if it were to have been included in the dictionary.

    However, in the file there are the following records:

    U+5374 kKarlgren 506A
    U+630C kKarlgren 411A
    U+811A kKarlgren 506A
    U+8173 kKarlgren 506A
    U+993C kKarlgren 333A-

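    None of these values fits that description: the index is followed
    by a letter "A" (in one case "A-") rather than by an asterisk. So,
    either the description of the field is incomplete, or the data
    is incorrect.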
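
    A scan of the whole file would show whether these five records are
    the only ones of this kind. What follows is only a minimal sketch
    in Python. It assumes each record is a code point, a field name and
    a value separated by whitespace, as in the lines quoted above, and
    it reads "the documented form" as decimal digits with an optional
    trailing asterisk. The function name is hypothetical.

```python
import re

# Assumed reading of the field description: a decimal index,
# optionally followed by a single asterisk.
DOCUMENTED = re.compile(r"[0-9]+\*?")


def undocumented_karlgren(lines):
    """Yield (code point, value) for kKarlgren values outside the documented form."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[1] == "kKarlgren":
            if not DOCUMENTED.fullmatch(parts[2]):
                yield parts[0], parts[2]


# The five records quoted above.
sample = [
    "U+5374 kKarlgren 506A",
    "U+630C kKarlgren 411A",
    "U+811A kKarlgren 506A",
    "U+8173 kKarlgren 506A",
    "U+993C kKarlgren 333A-",
]

for code_point, value in undocumented_karlgren(sample):
    print(code_point, value)
```

    All five quoted records are reported, since none of them consists
    of digits with an optional asterisk. To check the real data, the
    open file object could be passed in place of the sample list.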

    The field kFrequency is described as:

    # A rough fequency [sic] measurement for the character based
    # on analysis of Chinese USENET postings

    without further explanation. The field contains one of 1, 2, 3, 4, 5.
    I'd like to know, roughly, what these numbers mean.
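
    Counting how many characters carry each value might at least hint
    at which end of the scale is the frequent one, since one would
    expect the most frequent class to be the smallest. This is only a
    rough sketch in Python, assuming the same whitespace-separated
    record layout as the kKarlgren lines quoted above. The function
    name is hypothetical, and the count alone would not settle what
    the numbers mean.

```python
from collections import Counter


def frequency_histogram(lines):
    """Count how many records carry each kFrequency value."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[1] == "kFrequency":
            counts[parts[2]] += 1
    return counts


# Possible usage (file name as given above; the encoding is an assumption):
# with open("Unihan-3.2.0.txt", encoding="utf-8") as f:
#     for value, count in sorted(frequency_histogram(f).items()):
#         print(value, count)
```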


