Unihan DB / kKarlgren / kFrequency.

From: Pierpaolo BERNARDI (bernardp@cli.di.unipi.it)
Date: Sun Feb 23 2003 - 10:50:32 EST

    In the Unihan-3.2.0.txt file the field kKarlgren is described as:

    # The index of this character in _Analytic Dictionary of Chinese and
    # Sino-Japanese_ by Bernhard Karlgren, New York: Dover Publications,
    # Inc., 1974.
    # If the index is followed by an asterisk (*), then the index is an
    # interpolated one, indicating where the character would be found
    # if it were to have been included in the dictionary.

    However, the file contains the following records, whose values end in
    "A" or "A-", suffixes that the description does not mention:

    U+5374 kKarlgren 506A
    U+630C kKarlgren 411A
    U+811A kKarlgren 506A
    U+8173 kKarlgren 506A
    U+993C kKarlgren 333A-

    So, either the description of the field is incomplete, or the data
    is incorrect.
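
    One way to find every such record is a small scan of the file. The
    following is only a sketch: it assumes each data line has the form
    "U+XXXX <field> <value>" separated by whitespace, that comment lines
    start with "#", and that the documented format means "digits,
    optionally followed by *". The function name and the two conforming
    sample lines are made up for illustration; the other five sample lines
    are the records quoted above.

```python
import re

# Assumed reading of the quoted description: digits, optionally followed by "*".
DOCUMENTED = re.compile(r"[0-9]+\*?")


def undocumented_karlgren(lines):
    """Yield (code point, value) for kKarlgren values outside the assumed format."""
    for line in lines:
        # Skip comments and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        code_point, field, value = parts
        value = value.strip()
        if field == "kKarlgren" and not DOCUMENTED.fullmatch(value):
            yield code_point, value


sample = [
    "# comment line",
    "U+4E00\tkKarlgren\t175",    # hypothetical value, conforms to the description
    "U+4E01\tkKarlgren\t833*",   # hypothetical value, conforms (interpolated index)
    "U+5374\tkKarlgren\t506A",
    "U+630C\tkKarlgren\t411A",
    "U+811A\tkKarlgren\t506A",
    "U+8173\tkKarlgren\t506A",
    "U+993C\tkKarlgren\t333A-",
]

for code_point, value in undocumented_karlgren(sample):
    print(code_point, value)
```

    To run it on the real data, pass an open file object for
    Unihan-3.2.0.txt instead of the sample list (the file is presumably
    UTF-8 encoded).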

    The field kFrequency is described as:

    # A rough fequency [sic] measurement for the character based
    # on analysis of Chinese USENET postings

    without further explanation. The field contains one of the values
    1, 2, 3, 4, or 5. I'd like to know, roughly, what these numbers mean.
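
    Counting how many characters carry each kFrequency value does not
    answer the question, but it might hint at which end of the scale is
    the "frequent" one. A minimal sketch, under the same assumptions about
    the line layout ("U+XXXX <field> <value>", "#" for comments); the
    function name and all sample lines are hypothetical, not taken from
    the real file:

```python
from collections import Counter


def frequency_histogram(lines):
    """Count how many records carry each kFrequency value."""
    counts = Counter()
    for line in lines:
        # Skip comments and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[1] == "kFrequency":
            counts[parts[2].strip()] += 1
    return counts


# Hypothetical records, for illustration only.
sample = [
    "# comment line",
    "U+4E00\tkFrequency\t1",
    "U+4E01\tkFrequency\t5",
    "U+4E03\tkFrequency\t1",
    "U+4E03\tkKarlgren\t100",
]

histogram = frequency_histogram(sample)
for value in sorted(histogram):
    print(value, histogram[value])
```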

