From: John Delacour (JD@BD8.COM)
Date: Sat Oct 11 2003 - 10:49:09 CST
> > Contact: firstname.lastname@example.org
> > Report Type: Other Question, Problem, or Feedback
> > My problem is to recognize from the 32 bit value of unicode
> > character if this is a chinese character or korean or japanese.
> > How can I do this?
You can tell that it is NOT from a legacy character set such as
Shift_JIS or Big5 if converting it to that character set fails. Or
you can look it up in Unihan.txt
<http://www.unicode.org/Public/UNIDATA/Unihan.txt> (25 megabytes,
also at the ftp site). There are also Perl routines for getting at
U+4E01 kAlternateKangXi 0075.003
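
Both suggestions above can be sketched in a few lines of Python. This is
only an illustration, not the Perl routines mentioned above: the helper
names legacy_charsets_containing and parse_unihan_line, and the choice of
codecs in LEGACY_CODECS, are assumptions made for the example. The parser
assumes each Unihan.txt data line has three whitespace-separated fields,
as in the sample line above.

```python
# Sketch only: the codec list and both helper names are illustrative
# assumptions, not part of any existing library.
LEGACY_CODECS = ("shift_jis", "big5", "gb2312", "euc_kr")


def legacy_charsets_containing(code_point):
    """Return the codecs in LEGACY_CODECS that can encode code_point."""
    ch = chr(code_point)
    hits = []
    for codec in LEGACY_CODECS:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            continue  # conversion failed: the character is NOT in this set
        hits.append(codec)
    return hits


def parse_unihan_line(line):
    """Split one Unihan.txt data line into (code_point, field, value).

    Returns None for blank lines and '#' comment lines.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    code, field, value = line.split(None, 2)
    return int(code[2:], 16), field, value  # drop the "U+" prefix


if __name__ == "__main__":
    print(legacy_charsets_containing(0x4E01))
    print(parse_unihan_line("U+4E01 kAlternateKangXi 0075.003"))
```

Note that a successful conversion proves little: an ideograph shared by
several languages, such as U+4E01, converts to more than one of these
sets, so the test rules character sets out rather than identifying a
language.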