Errors in Unihan?

From: Pierpaolo Bernardi (bernardp@CLI.DI.Unipi.IT)
Date: Tue Nov 14 2000 - 11:43:29 EST


Hello,

In the Unihan.txt database, some entries in the kMandarin field list
the same pronunciation more than once. For example:

U+4E21 kMandarin 1 LIANG3 2 LIANG3 3 LIANG4
U+4E4E kMandarin 1 HU1 HU2 2 HU1
U+4E86 kMandarin 1 LIAO3 2 LE LIAO3

Is there a reason for these duplicates? If so, the format of this
field should be documented better in the file header. If the
duplicates are errors, I can supply a list of them.
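
A minimal sketch of how such a list could be generated. It assumes each
line is whitespace-separated as in the examples above (code point, field
name, then values) and that the isolated numbers can simply be skipped;
the function name duplicate_readings is made up for illustration.

```python
from collections import Counter

def duplicate_readings(line):
    """Return (code point, readings that occur more than once) for one line."""
    fields = line.split()
    codepoint, tag, values = fields[0], fields[1], fields[2:]
    if tag != "kMandarin":
        return codepoint, []
    # Assumption: all-digit tokens are the isolated numbers, not readings.
    readings = [v for v in values if not v.isdigit()]
    counts = Counter(readings)
    return codepoint, sorted(r for r, n in counts.items() if n > 1)

samples = [
    "U+4E21 kMandarin 1 LIANG3 2 LIANG3 3 LIANG4",
    "U+4E4E kMandarin 1 HU1 HU2 2 HU1",
    "U+4E86 kMandarin 1 LIAO3 2 LE LIAO3",
]

for sample in samples:
    codepoint, dups = duplicate_readings(sample)
    if dups:
        print(codepoint, " ".join(dups))
```

Run over the whole file, this would report every kMandarin entry in
which a reading is repeated, regardless of which number precedes it.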

Also, what's the meaning of the isolated numbers?

----------------

Other entries certainly contain errors, for example:

U+5594 kMandarin 1 WO1 2 01
                         ^ this is zero.

U+4EC0 kMandarin 1 SHI2 2 SHEN2 3 SHI2 SHIU2SHEN2 SHI2
                                       ^^^^^^^^^^ ?? --> shi2 shen2 ??
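
Entries like these could probably be caught mechanically. The sketch
below assumes that a well-formed reading is a run of capital letters
followed by at most one tone digit from 1 to 5, and that an isolated
number has no leading zero. Both patterns are my guesses from the
examples, not something the header documents, and the letter class may
need widening if other characters occur in readings.

```python
import re

# Assumed shapes, inferred from the examples in this message only.
SYLLABLE = re.compile(r"[A-Z]+[1-5]?")  # e.g. LIANG3, LE
INDEX = re.compile(r"[1-9][0-9]*")      # isolated number such as 1, 2, 3

def suspicious_tokens(line):
    """Return the value tokens that match neither assumed pattern."""
    codepoint, tag, *values = line.split()
    return [
        v for v in values
        if not (SYLLABLE.fullmatch(v) or INDEX.fullmatch(v))
    ]

samples = [
    "U+5594 kMandarin 1 WO1 2 01",
    "U+4EC0 kMandarin 1 SHI2 2 SHEN2 3 SHI2 SHIU2SHEN2 SHI2",
    "U+4E86 kMandarin 1 LIAO3 2 LE LIAO3",
]

for sample in samples:
    bad = suspicious_tokens(sample)
    if bad:
        print(sample.split()[0], " ".join(bad))
```

A check of this kind only flags tokens for a human to look at; it
cannot tell what the intended reading was.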

Regards,
  Pierpaolo Bernardi


