CJK conversion problem

Date: Thu Jul 06 2000 - 16:57:58 EDT


I need to write a program to convert more or less all kinds of CJK encodings to
Unicode (UTF-8, to be precise). I got the Unihan3.0 file from
and made a quick analysis of the CNS characters in there. I used the
and kIRG_TSource tags, converted the codes to r/c and printed the result.
I must admit that I don't understand the results I get:
There are entries for plane 3, rows > 66 (which, according to Ken Lunde's
are not defined; plane 3 stops at row 66); OTOH, I found quite a few characters
missing from plane 4, and almost all from planes 5, 6, 7, and 15.
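
For reference, the code-to-r/c step described above can be sketched roughly as follows. This is only a minimal sketch, not the actual script: it assumes each data line has three tab-separated fields (code point, tag, value) and that a kIRG_TSource value looks like "plane-hexcode" (for example "1-4421", with "F" standing for plane 15, and possibly a leading "T"). The function names are made up for illustration.

```python
def cns_to_row_cell(source_value):
    """Split an assumed 'plane-hexcode' value into (plane, row, cell)."""
    # Assumption: some file versions prefix the value with 'T'; drop it if present.
    plane_text, code_text = source_value.lstrip("T").split("-")
    plane = int(plane_text, 16)          # 'F' is read as plane 15
    code = int(code_text, 16)            # two-byte code, e.g. 0x4421
    row = (code >> 8) - 0x20             # high byte 0x21..0x7E -> row 1..94
    cell = (code & 0xFF) - 0x20          # low byte 0x21..0x7E -> cell 1..94
    return plane, row, cell


def parse_unihan_line(line, tag="kIRG_TSource"):
    """Return (code_point_text, (plane, row, cell)) or None for other lines."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3 or fields[1] != tag:
        return None                      # comment line or a different tag
    return fields[0], cns_to_row_cell(fields[2])


if __name__ == "__main__":
    # Sample line in the assumed format, for illustration only.
    print(parse_unihan_line("U+4E00\tkIRG_TSource\t1-4421"))
```

With this arithmetic, a value such as "3-6321" comes out as plane 3, row 67, which is the kind of entry beyond row 66 mentioned above.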
My questions are: Why are so many characters missing?
And: what am I supposed to do if I encounter a text that uses these characters?
AFAIK, there's no other way than to do a mapping from e.g. CNS to Unicode, and then
convert the resulting code points to UTF-8.
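
To make the two-step approach concrete, here is a rough sketch of what I have in mind. It is not a finished converter: the table holds a single sample entry for illustration and would have to be filled from a real mapping source, the function name is hypothetical, and substituting U+FFFD for unmapped characters is just one possible policy.

```python
# Step 1 table: (plane, row, cell) -> Unicode code point.
# Single sample entry for illustration; a real table would be loaded from a
# mapping file.
CNS_TO_UNICODE = {
    (1, 36, 1): 0x4E00,
}

REPLACEMENT = "\uFFFD"  # assumed policy for characters missing from the table


def cns_to_utf8(plane, row, cell, table=CNS_TO_UNICODE):
    """Map one CNS character to Unicode, then encode that code point as UTF-8."""
    code_point = table.get((plane, row, cell))
    if code_point is None:
        # No mapping known: emit the replacement character instead of failing.
        return REPLACEMENT.encode("utf-8")
    return chr(code_point).encode("utf-8")


if __name__ == "__main__":
    print(cns_to_utf8(1, 36, 1))   # mapped sample entry
    print(cns_to_utf8(5, 1, 1))    # not in the sample table
```

The open question is what the second case should really do when the source text is valid CNS but the table has no entry for it.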

Any help?

