Re: FW: Web Form: Other Question: CJK

From: John Delacour (JD@BD8.COM)
Date: Sat Oct 11 2003 - 10:49:09 CST

> > My problem is to recognize from the 32 bit value of unicode
> > character if this is a chinese character or korean or japanese.
> > How can I do this?

You can tell that it is NOT from a legacy character set such as
shift_jis or big5 if converting it to that character set fails. Or
you can look it up in Unihan.txt (25 megabytes, also at the ftp
site). There are also Perl routines for getting at the information.

U+4E01 kAlternateKangXi 0075.003
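
For illustration only, here is a rough Python sketch of both ideas (not
the Perl routines mentioned above). Everything named in it is an
assumption made for the example: the codec table (gb2312 and euc_kr are
added next to shift_jis and big5), the function names, and the path you
pass to the lookup. Note that the conversion test only rules character
sets out: many ideographs, U+4E01 among them, convert successfully to
several legacy sets, so success does not pin down one language.

```python
# Illustrative codec table; the labels and the choice of codecs are assumptions.
LEGACY_CODECS = {
    "Japanese (Shift_JIS)": "shift_jis",
    "Traditional Chinese (Big5)": "big5",
    "Simplified Chinese (GB2312)": "gb2312",
    "Korean (EUC-KR)": "euc_kr",
}


def legacy_charsets_containing(ch):
    """Return the labels of the legacy charsets that can encode ch.

    A charset missing from the result cannot represent the character,
    so the character is NOT from that charset.
    """
    found = []
    for label, codec in LEGACY_CODECS.items():
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            continue  # conversion failed: not in this charset
        found.append(label)
    return found


def parse_unihan_line(line):
    """Split one Unihan data line into (code point, field name, value).

    Returns None for blank lines, '#' comment lines, and anything that
    does not look like 'U+XXXX field value'.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    parts = line.split(None, 2)  # accepts tabs or spaces as separators
    if len(parts) != 3 or not parts[0].startswith("U+"):
        return None
    code, field, value = parts
    return int(code[2:], 16), field, value


def unihan_fields(path, ch):
    """Collect every field recorded for ch in the Unihan file at path.

    path is whatever local copy of the data file you have (hypothetical).
    """
    target = ord(ch)
    fields = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            record = parse_unihan_line(line)
            if record and record[0] == target:
                fields[record[1]] = record[2]
    return fields
```

Calling legacy_charsets_containing("\ud55c") on a Hangul syllable, for
example, leaves out the Shift_JIS and Big5 entries, which is the
negative test described above.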


