Date: Mon Jan 25 2010 - 11:37:58 CST

    On Jan 24, 2010, at 10:59 AM, Ed Trager wrote:

    > What is the romanization system used for kJapaneseOn and kJapaneseKun
    > readings in the Unihan_Readings database?
    > I think that UAX#38 does not say ...

    kJapaneseOn and kJapaneseKun are two of the oldest fields in the Unihan database, with data provided initially by RLG and/or Xerox, and they've hardly been touched since. The data is old enough that we don't even have adequate internal documentation, which is why the documentation in UAX#38 is so sparse.

    Personally, I feel that the best course in the long term would be to do what we did with the kKorean field: create a new field with known provenance and documented contents and deprecate the current fields. (By using katakana and hiragana, we could distinguish on and kun readings and use only one field.) I haven't had time to do this myself, but if someone were to volunteer data from a source we could use in the Unihan database, I'm sure the UTC would be willing to consider making the transition.

    John H. Jenkins
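
The single-field idea in the message above (katakana for on readings, hiragana for kun readings) can be illustrated with a rough sketch. Everything here is an assumption made for illustration: no such field exists in the message, the space-separated layout is hypothetical, and the sample value for 山 is hand-written rather than taken from the Unihan database. The script test uses the Unicode Hiragana (U+3041..U+309F) and Katakana (U+30A0..U+30FF) block ranges.

```python
def classify_reading(reading: str) -> str:
    """Return 'on' for an all-katakana reading, 'kun' for an all-hiragana one."""
    if not reading:
        raise ValueError("empty reading")
    # Katakana block, which also contains the prolonged sound mark U+30FC.
    if all("\u30a0" <= ch <= "\u30ff" for ch in reading):
        return "on"
    # Hiragana block (assigned range starts at U+3041).
    if all("\u3041" <= ch <= "\u309f" for ch in reading):
        return "kun"
    raise ValueError(f"reading is not purely katakana or hiragana: {reading!r}")


def split_readings(field_value: str) -> dict:
    """Split a hypothetical space-separated combined field into on and kun lists."""
    result = {"on": [], "kun": []}
    for reading in field_value.split():
        result[classify_reading(reading)].append(reading)
    return result


# Hand-written sample for U+5C71 (山), not real Unihan data.
print(split_readings("サン やま"))
```

A romanized value such as "SAN" would raise ValueError in this sketch, which is the point of the design: the script itself carries the on/kun distinction, so one field is enough.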

