> I frequently consult the Unihan database to get detailed information
> about Japanese and Chinese characters, and I have noticed that at
> least some pages are encoded in more than one encoding, that is to
> say, although the main encoding is in "UTF-8" (as one would expect on
> the Unihan site), certain characters on those pages are encoded in
> "ISO-8859-1", which makes them unreadable until one forces a change
> of the encoding.
> I just looked at these pages:
> (character: 墳)
> (character: 墓)
> The wrongly encoded characters appear here in the Hanyu Pinyin
> column: the accented letters are from the ISO-8859-1 charset and not
> from UTF-8 and will only become legible if one changes the encoding
> setting to ISO-8859-1 (which renders, of course, much the rest of the
> page unusable)
> kHanyuPinyin 10485.060:fén,fèn
> kHanyuPinyin 10470.090:mù
> I suspect that the providers of this information would like to see
> all of it to be encoded in UTF-8 and that the current encoding scheme
> is just an accident. :-)

This is very odd. The Unihan data files, which can be downloaded and which
presumably drive that WWW service, have that information correctly encoded.

Quoting from Unihan_Readings.txt (Unicode 6.0):

U+58B3 kCantonese fan4
U+58B3 kDefinition grave, mound; bulge; bulging
U+58B3 kHangul 분
U+58B3 kHanyuPinlu fen2(46)
U+58B3 kHanyuPinyin 10485.060:fén,fèn
U+58B3 kJapaneseKun HAKA
U+58B3 kJapaneseOn FUN
U+58B3 kKorean PWUN
U+58B3 kMandarin FEN2
U+58B3 kTang *bhiən
U+58B3 kVietnamese phần
U+58B3 kXHC1983 0322.071:fén

My guess is that the WWW service is using a pre-release version
of the data which had some encoding errors.
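
For what it is worth, the symptom you describe is easy to reproduce.
The following is only a minimal sketch in Python (the variable and
function names are my own, nothing to do with the Unihan site): it
encodes the kHanyuPinyin reading once as ISO-8859-1 and once as UTF-8,
and shows that the ISO-8859-1 bytes are not valid UTF-8, so a browser
reading the page as UTF-8 cannot display them.

```python
# The reading from the kHanyuPinyin field quoted above.
reading = "fén,fèn"

# The same text as bytes in the two encodings under discussion.
latin1_bytes = reading.encode("iso-8859-1")  # one byte per accented letter
utf8_bytes = reading.encode("utf-8")         # two bytes per accented letter


def decodes_as_utf8(data):
    """Return True if the byte string is well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# What a UTF-8 decoder makes of the ISO-8859-1 bytes: each accented
# letter becomes U+FFFD (the replacement character).
as_seen_in_utf8_page = latin1_bytes.decode("utf-8", errors="replace")

if __name__ == "__main__":
    print(decodes_as_utf8(utf8_bytes))
    print(decodes_as_utf8(latin1_bytes))
    print(as_seen_in_utf8_page)
```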

My advice is to download the data and search it directly.
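
A rough sketch of such a search, in Python, in case it is useful. It
assumes the layout of the Unihan data files as I understand it: UTF-8
text, one entry per line, three tab-separated fields (code point, field
name, value), and comment lines starting with "#". The function names
and the sample lines are made up for illustration; the sample values
are copied from the excerpt above.

```python
def unihan_fields(lines, char):
    """Collect {field name: value} for one character from Unihan-style lines."""
    codepoint = f"U+{ord(char):04X}"  # e.g. "U+58B3"
    fields = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t", 2)
        if len(parts) == 3 and parts[0] == codepoint:
            fields[parts[1]] = parts[2]
    return fields


def lookup_in_file(path, char):
    """Same search, reading a downloaded data file as UTF-8."""
    with open(path, encoding="utf-8") as f:
        return unihan_fields(f, char)


# A few lines in the assumed format, taken from the excerpt above.
sample = [
    "# sample lines, not the real file",
    "U+58B3\tkHanyuPinyin\t10485.060:fén,fèn",
    "U+58B3\tkMandarin\tFEN2",
    "U+58B3\tkVietnamese\tphần",
]

if __name__ == "__main__":
    for name, value in unihan_fields(sample, chr(0x58B3)).items():
        print(name, value)
```

With a downloaded copy one would call something like
lookup_in_file("Unihan_Readings.txt", "墳") instead of using the
sample list.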

