Re: Unihan data for U+2B5B8 error

From: John H. Jenkins <>
Date: Wed, 19 Oct 2011 11:41:00 -0600

On 19 Oct 2011, at 4:14 AM, Andrew West wrote:

> On 19 October 2011 10:43, shi zhao <> wrote:
>> The page said kTraditionalVariant of U+2B5B8 is U+9858 願.
> which is correct.
>> ) said U+2B5B8 𫖸 is kSimplifiedVariant of U+9858 願, U+613F 愿 is
>> kSemanticVariant, but 愿 is simplified of 願, not U+2B5B8 𫖸.
> which I agree is not correct. It's not always clear how asymmetrical
> cases like this should be handled. For U+9918 餘, which is analogous,
> with a common simplified form U+4F59 余 and an alternate simplified
> form U+9980 馀, the Unihan database lists them both as simplified
> variants of U+9918:
> U+9918 kSimplifiedVariant U+4F59 U+9980
> On this precedent, I would expect:
> U+9858 kSimplifiedVariant U+613F U+2B5B8

Actually, it's a bit more complicated than that. Note that the kSemanticVariant field for U+613F is actually "U+9858<kFenn", which means that Fenn's _Five Thousand Dictionary_ lists the two as semantic variants. (That should actually be "U+9858<kFenn:T", since Fenn indicates they are complete synonyms.) Fenn is a TC-only dictionary. Note, too, that U+613F has both kCihaiT and kGSR fields, also indicating that it is used in TC.
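For anyone who wants to inspect these values programmatically, here is a minimal Python sketch of how a variant value such as "U+9858<kFenn:T" could be split into its target character, source tag, and flags. The function name parse_variant and the regular expression are illustrative assumptions, not part of any Unihan tooling, and the pattern is deliberately looser than the formal field syntax.

```python
import re

# Hypothetical helper: a loose pattern for one variant value, e.g. "U+9858<kFenn:T".
_VARIANT = re.compile(r"^U\+([0-9A-F]{4,6})(?:<(.+))?$")


def parse_variant(value):
    """Return (character, [(source, flags), ...]) for one variant value."""
    match = _VARIANT.match(value)
    if match is None:
        raise ValueError(f"not a variant value: {value!r}")
    char = chr(int(match.group(1), 16))
    sources = []
    if match.group(2):
        # Several sources may be listed, separated by commas.
        for item in match.group(2).split(","):
            name, _, flags = item.partition(":")
            sources.append((name, flags))
    return char, sources


if __name__ == "__main__":
    print(parse_variant("U+9858<kFenn:T"))
```

A value with no source annotation, such as "U+613F", simply comes back with an empty source list.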

The HYDZD entry for U+613F first gives its old, TC definition (prudent, cautious; "愿,謹也" per the Shuowen), then adds, "today used as the simplified form for 願".

So U+613F is a TC character in its own right meaning one thing, as well as the simplification/variant of another TC character meaning something else. What we should have, therefore, is:

U+613F kDefinition (variant/simplification of U+9858 願) desire, want, wish; (archaic) prudent, cautious
U+613F kSemanticVariant U+9858<kFenn:T
U+613F kSpecializedSemanticVariant U+9858<kHanYu:T
U+613F kTraditionalVariant U+613F U+9858
U+613F kSimplifiedVariant U+613F
U+9858 kSimplifiedVariant U+613F U+2B5B8
U+9858 kSemanticVariant U+613F<kFenn:T
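As a rough cross-check of the simplified/traditional entries above, the following Python sketch looks for kSimplifiedVariant links that are not mirrored by a kTraditionalVariant link on the target. Treating such mirroring as desirable is an assumption made for illustration, not a stated Unihan rule, and the names load, unmirrored, and PROPOSED are hypothetical. The list also includes the existing kTraditionalVariant entry for U+2B5B8 mentioned at the top of the thread.

```python
# Proposed simplified/traditional entries from this message, plus the
# existing U+2B5B8 entry quoted earlier in the thread.
PROPOSED = [
    "U+613F kTraditionalVariant U+613F U+9858",
    "U+613F kSimplifiedVariant U+613F",
    "U+9858 kSimplifiedVariant U+613F U+2B5B8",
    "U+2B5B8 kTraditionalVariant U+9858",
]


def load(lines):
    """Build {code point: {field: [targets]}} from 'U+XXXX kField values' lines."""
    table = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue
        code_point, field, values = parts[0], parts[1], parts[2:]
        # Drop source annotations such as "<kFenn:T"; keep only the target.
        table.setdefault(code_point, {})[field] = [v.split("<")[0] for v in values]
    return table


def unmirrored(table):
    """List (source, target) simplified links with no traditional link back."""
    missing = []
    for code_point, fields in table.items():
        for target in fields.get("kSimplifiedVariant", []):
            back = table.get(target, {}).get("kTraditionalVariant", [])
            if code_point not in back:
                missing.append((code_point, target))
    return missing


if __name__ == "__main__":
    print(unmirrored(load(PROPOSED)))
```

With the four lines above the check finds nothing to report; dropping the U+2B5B8 line would flag the link from U+9858 to U+2B5B8 as unmirrored.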

Andrew, does that look like it covers everything correctly?

> I suggest you report this issue on the Unicode Error Reporting form:
> <>

Always sage advice, since you can't count on there being anybody reading this mailing list who can make the change. When you do so, *please* include a source for your information. We get all kinds of offered corrections to the Unihan data which we can't use because there's no authoritative source.

John H. Jenkins
Received on Wed Oct 19 2011 - 12:44:07 CDT
