Re: pronunciation u+6784 ?!

From: Thomas Chan (tc31@cornell.edu)
Date: Fri Aug 23 2002 - 21:21:52 EDT


On Fri, 23 Aug 2002, Rafael Humpert wrote:

> the character u+6784 is listed in the Unihan Search Page with a
> Mandarin pronunciation of gou1. However, the corresponding complex
> writing form of u+69cb is spelled gou4, which is what I believe
> correct.
> What gives? Is there a mistake?

It's reasonable to think that U+6784 should have the same reading(s) as
U+69CB, since the former is a vulgar (and now, PRC simplified) form of the
latter, and I would agree that gou4 'structure' is correct according to
both PRC and ROC standards and their various dictionary compilers. The
_Guangyun_ fanqie indicates the "qu" (departing) tone, which regularly
becomes Mandarin tone 4, so gou4 has history on its side.

I don't know where the gou1 came from, but the _Hanyu Da Zidian_ (2: 1260)
says that U+69CB has been used to write a cognate word, gou1, which was
first a bamboo cover, and later a bonfire (usually written as U+7BDD).
Perhaps this is the source of gou1?--in which case, this would only
highlight the loss of data when the definitions and readings for a
character are not linked (as happens in some databases).
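
To make the point about linkage concrete, here is a minimal sketch in
Python of storing each reading together with its definition. The Sense
class, the LINKED table and the helper names are hypothetical, made up for
this illustration; the data are only the three usages of U+69CB discussed
in this message, not a dump of any real database.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sense:
    """One word written with a character: reading and meaning kept together."""
    reading: str                      # Mandarin reading with tone number
    gloss: str                        # short English definition
    usual_form: Optional[str] = None  # character that usually writes this word, if another


# Hypothetical table holding only the usages mentioned in this message.
LINKED = {
    "U+69CB": [
        Sense("gou4", "structure"),
        Sense("gou1", "bamboo cover; later bonfire", usual_form="U+7BDD"),
        Sense("jue2", "rafter", usual_form="U+6877"),
    ]
}


def glosses_for(codepoint: str, reading: str) -> List[str]:
    """Return the definitions that belong to one particular reading."""
    return [s.gloss for s in LINKED.get(codepoint, []) if s.reading == reading]


def primary_readings(codepoint: str) -> List[str]:
    """Readings for which this character is itself the usual written form."""
    return [s.reading for s in LINKED.get(codepoint, []) if s.usual_form is None]


print(glosses_for("U+69CB", "gou1"))
print(primary_readings("U+69CB"))
```

With the link preserved, a stray gou1 can be traced back to the word it
belongs to; with two flat lists of readings and definitions, it cannot.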

There's also a usage of U+69CB to write jue2 'rafter' (usually written as
U+6877). But I don't know if U+6784 should automatically inherit either of
these two other usages of U+69CB.

There are a number of issues, including but not limited to:
  - synchronically, there are at least two major standards, PRC and ROC--
    see http://zhongwen.com/x/guopu.htm for some examples
  - individual dictionaries may differ, especially for rare characters
    (for extinct words) where a modern reading has to be reconstituted
  - diachronic differences from drawing upon data from older dictionaries--
    I suspect some of the definitions and readings might be drawn from the
    likes of the 1940s-era Mathews' Chinese-English dictionary
  - different readings for different words written with the same character
  - and of course, typos or flat-out mistakes

For these reasons, I use the readings and definitions in the unihan.txt
file only as a starting point, and follow the various dictionary page/index
pointers to look the characters up in the dictionaries themselves,
expensive and time-consuming though that is.
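
As a rough sketch of the "starting point" use, the Python below reads
kMandarin entries in the tab-separated layout of unihan.txt (code point,
field name, value) and flags variant pairs whose readings differ, so that
they can then be checked by hand. The two sample lines are illustrative
only, modelled on the discrepancy reported in this thread rather than
copied from the file, and the function names and the hand-supplied variant
pair are assumptions of this example.

```python
from collections import defaultdict

# Illustrative lines only; not copied from the real unihan.txt.
SAMPLE = """\
# code point <TAB> field <TAB> value
U+6784\tkMandarin\tGOU1
U+69CB\tkMandarin\tGOU4
"""


def parse_readings(text, field="kMandarin"):
    """Collect the set of readings per code point for one field."""
    readings = defaultdict(set)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue  # skip anything not in the three-column layout
        codepoint, name, value = parts
        if name == field:
            readings[codepoint].update(v.lower() for v in value.split())
    return dict(readings)


def needs_checking(readings, variant_pairs):
    """Return the variant pairs whose reading sets are not identical."""
    return [
        (a, b)
        for a, b in variant_pairs
        if readings.get(a, set()) != readings.get(b, set())
    ]


readings = parse_readings(SAMPLE)
# Hand-supplied (simplified, traditional) pair, an assumption of this sketch.
flagged = needs_checking(readings, [("U+6784", "U+69CB")])
print(flagged)
```

A flagged pair is only a prompt to open the dictionaries; it does not say
which of the two readings is the mistaken one.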

Thomas Chan
tc31@cornell.edu
